TYMSHARE MANUALS 
TYMCOM-X 



STATPAK



FEBRUARY 1972 



TYMSHARE, INC. 
CUPERTINO, CALIFORNIA 95014



TYMSHARE, INC., Litho in U.S.A.






CONTENTS 

Page 

SECTION 1 - INTRODUCTION 1 

SECTION 2 - USING STATPAK 3 

ERROR CORRECTION 4 

DATA FILES 4 

SECTION 3 - CREATING AND MANIPULATING DATA: 

THE CREME MODULE 7 

THE CRATE PROGRAM 7 

THE MODIF PROGRAM 9 

THE MERGE PROGRAM 12 

THE TRANS PROGRAM 15 



SECTION 4 - BASIC STATISTICAL ANALYSES: 

THE DATA MODULE 19 



SECTION 5 - REGRESSION ANALYSES: 

THE REGON MODULE 23 

THE DREGR PROGRAM 23 

THE RGSTP PROGRAM 27 

THE RGPOL PROGRAM 33 

SECTION 6 - PARAMETRIC HYPOTHESIS TESTING: 

THE PARH MODULE 41 

THE TSTAT PROGRAM 41 

THE FSTAT PROGRAM 44 

SECTION 7 - NON-PARAMETRIC HYPOTHESIS TESTING: 

THE NPARH MODULE 47 

THE UTEST PROGRAM 47 

THE CRANK PROGRAM 49 

THE CCORD PROGRAM 50 

SECTION 8 - TIME SERIES ANALYSIS: 

THE TIMSA MODULE 53 

THE TRIXP PROGRAM 53 

THE XPOSE PROGRAM 55 




SECTION 9 - DISCRIMINANT ANALYSIS: 

THE DISCA MODULE 63 



SECTION 10 - VARIANCE AND FACTOR ANALYSIS: 

THE AVANC MODULE 71 

THE ANVAR PROGRAM 71 

THE FCTOR PROGRAM 76 

APPENDIX - ERROR MESSAGES 83 



SECTION 1 
INTRODUCTION 

Tymshare's statistical package for use on the TYMCOM-X computer is called STATPAK. 
The major uses of STATPAK are in business and financial applications such as management 
science, operations research, market analysis, financial analysis, and investment analysis. 
STATPAK is also a convenient tool for scientists and engineers involved in statistical work. 
Tasks that can be performed by STATPAK include: 

• Data creation and editing. 

• Data screening and plotting. 

• Data transformations, including square root, natural and common logarithms, exponential, 
inverse, etc. 

• Parametric and non-parametric hypothesis testing. 

• Regression analysis. 

• Time series analysis. 

• Discriminant analysis. 

• Analysis of variance and factor analysis. 

A useful feature of STATPAK is the ability to store on a file results from a STATPAK 
analysis for use by a later STATPAK analysis or by a user-written program. 

STATPAK contains eight modules and eighteen programs to perform statistical analyses.
These are listed in the table below.

Section 2 of this manual describes the general use of STATPAK, including the creation of 
data files. Sections 3 through 10 describe each module and program with a sample execution 
of each. The Appendix contains a list of all STATPAK error messages and their explanations. 

In all examples throughout this manual, everything typed by the user is underlined.
User-typed Carriage Returns are represented by the symbol <CR>.

Control characters are denoted in this manual by a superscript c. For example, A^c denotes
Control A. The method of typing a control character depends upon the type of terminal being 
used. Consult the literature for your particular terminal. 



Module Name   Program Name   Description

CREME         CRATE          Creates a data file in the proper format for all STATPAK
                             programs except ANVAR.
              MODIF          Allows the user to modify a data file.
              MERGE          Allows the user to select data from two or more files and
                             to merge this data on a new data file.
              TRANS          Performs 12 data transformations.

DATA          DSCRE          Generates basic statistics, performs conditional and
                             unconditional analyses, and plots a histogram.

REGON         DREGR          Performs a multiple linear regression analysis.
              RGSTP          Performs a stepwise multiple linear regression analysis.
              RGPOL          Performs a polynomial regression analysis.

PARH          TSTAT          Computes the Student's t-statistic under four different
                             hypotheses.
              FSTAT          Computes the F-ratio and corresponding degrees of
                             freedom.

NPARH         UTEST          Tests the hypothesis that two independent samples come
                             from the same population.
              CRANK          Computes Kendall's rank correlation coefficient between
                             two variables.
              CCORD          Computes the degree of association among several variables
                             (the concordance coefficient) and the chi-square statistic.

TIMSA         TRIXP          Makes a forecast of a variable based on a set of past
                             observations.
              XPOSE          Makes a forecast of a variable using exponentially weighted
                             moving averages; includes linear trend and seasonal factors.

DISCA         DANSS          Performs a discriminant analysis and generates linear
                             discriminant functions.

AVANC         ANVAR          Performs an analysis of variance.
              FCTOR          Performs a factor analysis.



SECTION 2 
USING STATPAK 

STATPAK is called from XEXEC¹ by typing:

- R STATPAK <CR>

STATPAK then requests the name of the module desired by the user. For example,

- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)?



The user may respond with the name of any one of the eight STATPAK modules, or he 
may type HELP and a Carriage Return for a list of the modules and the programs within each. 

After the module is specified, STATPAK requests the name of the program to be called 
from that module. For example, 



- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)?



The user may respond to the last question with the name of a program in CREME, or he may 
type HELP for a list of all programs and modules. 

STATPAK asks a series of questions during each analysis. Each response must be terminated 
by a Carriage Return. If the response consists of two or more numbers, those numbers must 
be separated by spaces. 

When a module has run to completion, STATPAK requests the next module name. The user 
may type STOP to return to XEXEC. 



1 - Refer to the Tymshare TYMCOM-X XEXEC Reference Manual for a description of XEXEC. 



Error Correction 

While executing STATPAK programs, the user can correct his entries by using Control A 
or Control Q. If he types an incorrect character in STATPAK, he may delete it with a Con- 
trol A. The first time Control A is typed, a back slash (\) is printed, followed by a reprint of 
the deleted character. This key may be used repeatedly to delete successive characters to the 
left; each time Control A is typed, the next deleted character is reprinted. When all incorrect 
characters have been deleted and normal typing resumes, another back slash and the first new 
correct character are printed. For example, 

ABXCP A^c \ P A^c C A^c X C \ D CORP <CR>

is interpreted as 

ABCD CORP 

Note that the user types the C before the second back slash is printed; however, the C appears 
on the right side of the back slash in the actual printout. 

Control Q deletes an entire response. On some terminals, Control Q echoes as an up arrow. 
For example, 

- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CMRE ↑
CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)?



The user types CMRE and a Control Q. The carriage is returned by the system, and the user 
enters the correct response. 
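
The effect of Control A and Control Q on a response can be pictured with a short simulation. The sketch below is an illustration only, not STATPAK code: the function name `apply_line_editing` is hypothetical, and it models just the final response that reaches the program, not the back slashes and reprinted characters echoed on the terminal.

```python
CTRL_A = "\x01"  # Control A: delete the character to the left
CTRL_Q = "\x11"  # Control Q: delete the entire response typed so far


def apply_line_editing(keystrokes: str) -> str:
    """Return the response left after Control A / Control Q corrections.

    Hypothetical helper for illustration; it ignores terminal echo.
    """
    buffer = []
    for ch in keystrokes:
        if ch == CTRL_A:
            if buffer:          # nothing to delete on an empty response
                buffer.pop()
        elif ch == CTRL_Q:
            buffer.clear()
        else:
            buffer.append(ch)
    return "".join(buffer)


# The two corrections described in the text above.
first = apply_line_editing("ABXCP" + CTRL_A * 3 + "CD CORP")
second = apply_line_editing("CMRE" + CTRL_Q + "CREME")
print(first)
print(second)
```

Three Control A characters remove P, C, and X in turn, so the first response reduces to ABCD CORP; the Control Q discards CMRE, leaving CREME.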

Data Files 

All programs in STATPAK, except ANVAR¹ in module AVANC, accept data from a file
written in the following format:

N M
variable name 1
x1 x2 x3 ... xN
variable name 2
x1 x2 x3 ... xN
...
variable name M
x1 x2 x3 ... xN

where N is the number of observations for each variable.

M is the number of variables.

x is a data entry in free format. There may be as many as six data entries in each line.

1 - The ANVAR program is described on page 71.

NOTE: A variable name may have up to five alphanumeric characters. No variable can be 
named STOP. 

The XEXEC TYPE command is used below to list a sample data file.

- TYPE DFILE <CR>

3 2                 There are two variables with
HGT                 three observations per variable.
66 70 64.5
WGT
120 135 115
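
A user-written program that consumes a STATPAK data file has to read this layout back. The following is a minimal sketch of such a reader, assuming only the format described above (a header line "N M", then each variable name followed by its N free-format entries spread over one or more lines). The function name `parse_statpak` is hypothetical and is not part of STATPAK.

```python
def parse_statpak(text: str) -> dict[str, list[float]]:
    """Parse STATPAK-format text into {variable name: observations}.

    Illustrative reader only; it assumes the layout described in this
    section and does not enforce the six-entries-per-line limit.
    """
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines or len(lines[0]) < 2:
        raise ValueError("missing 'N M' header line")
    n_obs, n_vars = int(lines[0][0]), int(lines[0][1])

    data: dict[str, list[float]] = {}
    i = 1
    for _ in range(n_vars):
        if i >= len(lines):
            raise ValueError("file ends before all variables were read")
        name = lines[i][0]
        if len(name) > 5 or name == "STOP":
            raise ValueError(f"invalid variable name: {name}")
        i += 1
        values: list[float] = []
        # Entries for one variable may continue over several lines.
        while len(values) < n_obs:
            if i >= len(lines):
                raise ValueError(f"too few observations for {name}")
            values.extend(float(token) for token in lines[i])
            i += 1
        if len(values) != n_obs:
            raise ValueError(f"too many observations for {name}")
        data[name] = values
    return data


sample = """3 2
HGT
66 70 64.5
WGT
120 135 115
"""
print(parse_statpak(sample))
```

Because `float` accepts forms such as `.00` and `-0.3884822E-02`, the same reader should also cope with the decimal values that STATPAK itself writes.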



The user may create a data file in one of the editing languages, EDITOR or TECO,¹ or he
may use the CREME module, described on page 7. The example below demonstrates the
creation of a data file in EDITOR.

- EDITOR <CR>                      The user calls EDITOR from XEXEC.
* APPEND <CR>                      The APPEND command indicates that he wishes to enter data.
10 3 <CR>
AGE <CR>
46 24 32 41 50 63 <CR>
29 28 52 36 <CR>
HGT <CR>
64 72 71 68 65 75 <CR>
70 64 77 67 <CR>
WGT <CR>
173 170 154 129 192 203 <CR>
122 136 147 153 <CR>               The APPEND command is terminated by a Control D.
* WRITE SDATA <CR>                 The contents of EDITOR are written on the file SDATA.
NEW FILE <CR>                      The NEW FILE message indicates the user is creating a
                                   new file. He confirms this by typing a Carriage Return.²
130 CHARACTERS
* QUIT <CR>
-

The QUIT command returns the user to XEXEC, indicated by the dash.



1 - See the Tymshare TYMCOM-X TECO Reference Manual or the Tymshare EDITOR Reference Manual for more information. 

2 - The OLD FILE message here indicates that a file already exists by the specified name. A Carriage Return writes over the previous
contents. An Alt Mode/Escape returns the user to EDITOR command level, indicated by the asterisk. He then reexecutes the
WRITE command with another file name.



SECTION 3 

CREATING AND MANIPULATING DATA: 
THE CREME MODULE 



The CREME module consists of the programs CRATE, MODIF, MERGE, and TRANS, 
which perform the following functions: 

• CRATE creates a data file. 

• MODIF allows the user to edit a data file. 

• MERGE allows the user to merge data from two or more data files. 

• TRANS performs data transformations. 

MODIF, MERGE, and TRANS accept data from a file created by CRATE or written in the 
proper format in one of the editing languages. The format is described in Section 2. 

The CRATE Program 

CRATE creates a data file in the proper format for all STATPAK programs except ANVAR.¹
The example below demonstrates the use of the CRATE program.

Example

The user calls CRATE to create the file PAYRL, containing three variables, LEVEL, SALRY,
and HOURS.

- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? CRATE <CR>

DATA FILE CREATION PROGRAM
FILE NAME (5 OR LESS CHAR.)? PAYRL <CR>              The file created by STATPAK is
                                                     actually PAYRL.DAT. Refer to the
                                                     Tymshare TYMCOM-X XEXEC
                                                     Reference Manual for a description
                                                     of file name extensions.
HOW MANY VARIABLES? 3 <CR>
HOW MANY OBSERVATIONS PER VARIABLE? 10 <CR>          There are three variables with
                                                     ten observations per variable.
NAME OF VARIABLE 1 ( 5 OR LESS CHAR.)? LEVEL <CR>    The user names each variable
                                                     and enters all observations
                                                     for that variable.



1 - The ANVAR program is described on page 71. 



ENTER OBSERVATIONS 6 TO A LINE
? 1 2 5 5 4 3 <CR>                                   The user separates the observations by spaces.
? 3 3 1 1 <CR>
NAME OF VARIABLE 2 ( 5 OR LESS CHAR.)? SALRY <CR>
ENTER OBSERVATIONS 6 TO A LINE
? 5 4.25 2.6 2.75 3.15 3.70 <CR>
? 3.65 4 4.75 4.85 <CR>
NAME OF VARIABLE 3 ( 5 OR LESS CHAR.)? HOURS <CR>
ENTER OBSERVATIONS 6 TO A LINE
? 45 40 40 38 35 40 <CR>
? 40 45 35 40 <CR>

THATS ALL FOR FILE PAYRL
CREATE ANOTHER FILE? NO <CR>                         The user does not wish to create another file.

MODULE NAME (TYPE "HELP" FOR AID)? STOP <CR>         Control returns to XEXEC.
                                                     The execution times are printed.
EXECUTION TIME: 38.64 SEC.
TOTAL ELAPSED TIME: 2 MIN. 19.10 SEC.
NO EXECUTION ERRORS DETECTED

EXIT

- TYPE PAYRL.DAT <CR>                                The file created by CRATE is printed. Note
                                                     that values are stored as decimal numbers.



10 3
LEVEL
 1.000000  2.000000  5.000000  5.000000  4.000000
 3.000000
 3.000000  3.000000  1.000000  1.000000
SALRY
 5.000000  4.250000  2.600000  2.750000  3.150000
 3.700000
 3.650000  4.000000  4.750000  4.850000
HOURS
 45.00000  40.00000  40.00000  38.00000  35.00000
 40.00000
 40.00000  45.00000  35.00000  40.00000



The MODIF Program 

This program allows the user to edit a data file. The editing functions available in MODIF are: 

Function Description 

ADDON Adds a specified number of observations to each variable in the data file. 

CHANG Changes a variable name. 

CORCT Corrects individual data elements in the file. 

DELET Deletes a specified number of observations from the beginning or end for 
each variable. 

EXPND Allows the user to add a new variable with its observations. 

LIST Lists the data file on the terminal. 

REMOV Removes a variable and its observations. 

Each of these functions is demonstrated in the example below. 

Example 

In this example, the user reads the file SDATA and edits the contents using MODIF. The 
user lists the file which contains variables AGE, HGT, and WGT, with ten observations each. 
The user adds two observations to each variable using the ADDON function. Then he changes 
the name of variable HGT to HT. Using the CORCT function, the user changes the eleventh 
observation for variable WGT, just entered, from 180 to 186. The user then deletes the first 
observation of each variable using DELET. A new variable, SEX, is added, and an old variable, 
AGE, is deleted. Finally, the user lists the file with all its modifications. 

- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? MODIF <CR>       The user calls the MODIF program.
FILE NAME? SDATA <CR>
FUNCTION? LIST <CR>                                  The user lists file SDATA.

10 3
AGE
 46.00000  24.00000  32.00000  41.00000  50.00000
 63.00000
 29.00000  28.00000  52.00000  36.00000
HGT
 64.00000  72.00000  71.00000  68.00000  65.00000
 75.00000
 70.00000  64.00000  77.00000  67.00000
WGT
 173.0000  170.0000  154.0000  129.0000  192.0000
 203.0000
 122.0000  136.0000  147.0000  153.0000

MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? ADDON <CR>                                 The user requests the ADDON function.
HOW MANY OBSERVATIONS TO BE ADDED? 2 <CR>

ENTER NEW OBSERVATIONS FOR VARIABLE AGE 6 TO A LINE
? 42 47 <CR>
ENTER NEW OBSERVATIONS FOR VARIABLE HGT 6 TO A LINE
? 60 59 <CR>
ENTER NEW OBSERVATIONS FOR VARIABLE WGT 6 TO A LINE
? 180 175 <CR>

MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? CHANG <CR>                                 The user requests the CHANG function.
OLD VARIABLE NAME? HGT <CR>                          The user changes the variable
NEW VARIABLE NAME( 5 OR LESS CHAR.)? HT <CR>         name HGT to HT.
CHANGE ANOTHER VARIABLE NAME? NO <CR>
MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? CORCT <CR>                                 The user requests the CORCT function.
VARIABLE NAME? WGT <CR>
LINE NUMBER(TYPE 0 TO STOP)? 2 <CR>
WHICH ENTRY IN LINE 2 ? 5 <CR>                       The fifth entry on line 2 is modified.
WHAT SHOULD IT BE? 186 <CR>                          The user changes the eleventh
                                                     observation for variable WGT to 186.
ANOTHER ENTRY ON SAME LINE? NO <CR>
CORRECT ANOTHER VARIABLE? NO <CR>
MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? DELET <CR>                                 The user requests the DELET function.
HOW MANY OBSERVATIONS TO BE DELETED?
(>0 => DROP OFF END; <0 => DROP FROM BEGINNING): -1 <CR>
                                                     The user deletes the first observation of each variable.
MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? EXPND <CR>
HOW MANY VARIABLES TO BE ADDED? 1 <CR>

REMINDER 11 OBSERVATIONS PER VARIABLE.               There are 11 observations because
NEW VARIABLE NAME( 5 OR LESS CHAR.)? SEX <CR>        the user has added 2 to the initial
? 0 0 1 1 1 1 <CR>                                   10 observations, and deleted the
? 1 0 0 1 0 <CR>                                     first observation of each variable.

THATS ALL FOR FILE SDATA
MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? REMOV <CR>                                 The user removes the variable AGE.
NAME OF VARIABLE TO BE REMOVED? AGE <CR>
REMOVE ANOTHER VARIABLE? NO <CR>
MORE MODIFICATIONS? YES <CR>
FILE NAME? SDATA <CR>
FUNCTION? LIST <CR>                                  The user lists the modified file.

11 3
HT
 72.00000  71.00000  68.00000  65.00000  75.00000
 70.00000
 64.00000  77.00000  67.00000  60.00000  59.00000
WGT
 170.0000  154.0000  129.0000  192.0000  203.0000
 122.0000
 136.0000  147.0000  153.0000  186.0000  175.0000
SEX
 0.0000000  0.0000000  1.000000  1.000000  1.000000
 1.000000
 1.000000  0.0000000  0.0000000  1.000000  0.0000000

MORE MODIFICATIONS? NO <CR>

MODULE NAME (TYPE "HELP" FOR AID)?                   The user may type another module name.
                                                     He may type STOP to return to XEXEC.
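
The MODIF editing functions can be mirrored on an in-memory copy of a data file, which is a convenient way to check what a sequence of edits should produce. The sketch below is an illustration under stated assumptions, not the MODIF implementation: the data file is held as a dictionary of lists, the lowercase function names are hypothetical stand-ins for the MODIF functions of the same name, and CORCT addresses an entry by line and position with six entries to a line, as in the prompts of the example.

```python
PER_LINE = 6  # entries per line, as in the "6 TO A LINE" prompts


def addon(data, new_obs):
    """ADDON: append observations to every variable."""
    for name, values in new_obs.items():
        data[name].extend(values)


def chang(data, old, new):
    """CHANG: rename a variable, keeping the variable order."""
    renamed = [(new if name == old else name, values)
               for name, values in data.items()]
    data.clear()
    data.update(renamed)


def corct(data, name, line, entry, value):
    """CORCT: correct one entry, addressed by line number and position."""
    data[name][(line - 1) * PER_LINE + (entry - 1)] = value


def delet(data, count):
    """DELET: count > 0 drops from the end, count < 0 from the beginning."""
    for name, values in data.items():
        if count > 0:
            data[name] = values[:len(values) - count]
        elif count < 0:
            data[name] = values[-count:]


def expnd(data, name, values):
    """EXPND: add a new variable with one value per existing observation."""
    n_obs = len(next(iter(data.values())))
    if len(values) != n_obs:
        raise ValueError(f"expected {n_obs} observations")
    data[name] = list(values)


def remov(data, name):
    """REMOV: remove a variable and its observations."""
    del data[name]


# Replay of the example session on file SDATA.
data = {
    "AGE": [46, 24, 32, 41, 50, 63, 29, 28, 52, 36],
    "HGT": [64, 72, 71, 68, 65, 75, 70, 64, 77, 67],
    "WGT": [173, 170, 154, 129, 192, 203, 122, 136, 147, 153],
}
addon(data, {"AGE": [42, 47], "HGT": [60, 59], "WGT": [180, 175]})
chang(data, "HGT", "HT")
corct(data, "WGT", line=2, entry=5, value=186)   # eleventh observation
delet(data, -1)                                  # drop the first observation
expnd(data, "SEX", [0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0])
remov(data, "AGE")
for name, values in data.items():
    print(name, values)
```

Line 2, entry 5 corresponds to index (2 - 1) x 6 + (5 - 1) = 10, the eleventh observation, which is why the correction in the example lands on the value 180 just added by ADDON.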



The MERGE Program 

This program allows the user to merge selected variables from different data bases into one 
data base. In order to merge variables, each data file must have the same number of observa- 
tions per variable. If this is not the case, MERGE excludes the file(s) with an appropriate 
message. 
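
The compatibility rule can be sketched in a few lines. This is an illustration only: the function name `merge` and the dictionary representation are hypothetical, and the sketch assumes that the observation count of the first input file is the one every later file must match, which the manual does not state explicitly.

```python
def merge(selections):
    """Merge chosen variables from several data files.

    selections: list of (file_name, data, wanted) tuples, where data maps
    variable names to observation lists and wanted names the variables
    to copy. Returns (merged, excluded_file_names).
    """
    merged = {}
    excluded = []
    n_obs = None  # assumption: fixed by the first input file
    for file_name, data, wanted in selections:
        file_n = len(next(iter(data.values())))
        if n_obs is None:
            n_obs = file_n
        if file_n != n_obs:
            # Incompatible number of observations: exclude the whole file.
            excluded.append(file_name)
            continue
        for name in wanted:
            merged[name] = list(data[name])
    return merged, excluded


files = [
    ("STAT1", {"VAR1": list(range(1, 11)), "VAR3": [5, 4, 6, 3, 2, 9, 8, 4, 5, 2]},
     ["VAR1", "VAR3"]),
    ("STAT2", {"VAR5": [3, 4, 5, 8, 9, 17, 2, 34, 12, 5]}, ["VAR5"]),
    ("STAT4", {"VAR8": list(range(1, 16))}, ["VAR8"]),
]
merged, excluded = merge(files)
print(list(merged), excluded)
```

Here the hypothetical file STAT4 holds 15 observations per variable against 10 in the others, so it is reported in the excluded list and contributes nothing to the result.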

Example 

The user wishes to merge variables from four files. From file STAT1 variables VAR1 and 
VAR3 are desired, from STAT2 variable VAR5, from STAT3 variable VAR7, and from STAT4 
variable VAR8 is desired. The merge information is to be saved on file STATS. Note that 
STAT4 has a different number of observations and therefore is excluded from the merge. 

The contents of the four files STAT1, STAT2, STAT3, and STAT4 are listed with TYPE
commands.

- TYPE STAT1 <CR>

10 3
VAR1
1 2 3 4 5 6
7 8 9 10
VAR2
3 9 12 15 19 21
24 27 30 33
VAR3
5 4 6 3 2 9
8 4 5 2

- TYPE STAT2 <CR>

10 2
VAR4
3 7 8 5 2 5
9 4 5 3
VAR5
3 4 5 8 9 17
2 34 12 5

- TYPE STAT3 <CR>

10 2
VAR6
7 6 3 2 1 9
7 4 3 2
VAR7
23 21 28 32 23 24
21 26 27 23

- TYPE STAT4 <CR>

15 1
VAR8
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15



- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? MERGE <CR>       The user requests the MERGE program.
NAME OF OUTPUT FILE.? STATS <CR>                     The file STATS.DAT will contain the merged results.
NAME OF INPUT FILE?. TYPE STOP WHEN DONE.
STAT1 <CR>
TYPE VARIABLE NAMES IN RESPONSE TO ?.TYPE STOP WHEN DONE.
? VAR1 <CR>
? VAR3 <CR>
? STOP <CR>
NAME OF INPUT FILE?. TYPE STOP WHEN DONE.
STAT2 <CR>
TYPE VARIABLE NAMES IN RESPONSE TO ?.TYPE STOP WHEN DONE.
? VAR5 <CR>
? STOP <CR>
NAME OF INPUT FILE?. TYPE STOP WHEN DONE.
STAT3 <CR>
TYPE VARIABLE NAMES IN RESPONSE TO ?.TYPE STOP WHEN DONE.
? VAR7 <CR>
? STOP <CR>
NAME OF INPUT FILE?. TYPE STOP WHEN DONE.
STAT4 <CR>
TYPE VARIABLE NAMES IN RESPONSE TO ?.TYPE STOP WHEN DONE.
? VAR8 <CR>
INCOMPATIBLE NO. OF OBSERVATIONS 15 IN FILE STAT4.
FILE EXCLUDED IN MERGE ROUTINE.
NAME OF INPUT FILE?. TYPE STOP WHEN DONE.
STOP <CR>

MODULE NAME (TYPE "HELP" FOR AID)? STOP <CR>

EXECUTION TIME: 57.31 SEC.
TOTAL ELAPSED TIME: 2 MIN. 6.17 SEC.
NO EXECUTION ERRORS DETECTED

EXIT

- TYPE STATS.DAT <CR>                                The file STATS.DAT contains the results of the MERGE routine.

10 4
VAR1
 1.000000  2.000000  3.000000  4.000000  5.000000
 6.000000
 7.000000  8.000000  9.000000  10.00000
VAR3
 5.000000  4.000000  6.000000  3.000000  2.000000
 9.000000
 8.000000  4.000000  5.000000  2.000000
VAR5
 3.000000  4.000000  5.000000  8.000000  9.000000
 17.00000
 2.000000  34.00000  12.00000  5.000000
VAR7
 23.00000  21.00000  28.00000  32.00000  23.00000
 24.00000
 21.00000  26.00000  27.00000  23.00000



The TRANS Program 

TRANS allows the user to create a data file that is a simple function of his original data file. 
The input data file is limited to 60 variables and 500 observations per variable. The total num- 
ber of observations on the file may not exceed 4000. 

The transformations that TRANS can perform are listed in the table below.

Name   Function    Comments

ABS    |x|         Absolute value.
SQR    √x          Square root; x must be greater than or equal to zero.
PWR    Cx^D
CPW    CD^x
INV    C/x         x must not equal zero.
AMC    C + Dx
EXP    Ce^x        Exponential; e is base of natural logarithm.
LOG    log_e x     Natural logarithm; x must not equal zero.
LGT    log_10 x    Base 10 logarithm; x must not equal zero.
LIN    Cx + Dy
MUL    Cxy
DIV    Cx/Dy       y must not equal zero.

where x and y are variables, and C and D are constants supplied by the user.

As an example, the LIN transformation, Cx+Dy, is performed on the first set of data below 
to produce the second set. The constants, C and D, are set to 2 and 3, respectively. 

Original Data 

variable x: 3 4 5 8 
variable y: 4 2 6 7 

Transformed Data 

variable x: 18 14 28 37 
variable y: 4 2 6 7

Thus, the observations of variable x are replaced by 2x+3y, and the observations of y are 
unchanged. 
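
The twelve transformations are simple element-by-element functions, so they can be written down compactly. The sketch below is an illustration, not the TRANS program: the table `TRANSFORMS` and the function `transform` are hypothetical names, DIV is read as (Cx)/(Dy), and no domain checking (zero divisors, negative square roots) is attempted beyond what Python itself raises.

```python
import math

# One entry per transformation name; each takes x, y, C, D even when
# some of the arguments are unused.
TRANSFORMS = {
    "ABS": lambda x, y, c, d: abs(x),
    "SQR": lambda x, y, c, d: math.sqrt(x),
    "PWR": lambda x, y, c, d: c * x ** d,
    "CPW": lambda x, y, c, d: c * d ** x,
    "INV": lambda x, y, c, d: c / x,
    "AMC": lambda x, y, c, d: c + d * x,
    "EXP": lambda x, y, c, d: c * math.exp(x),
    "LOG": lambda x, y, c, d: math.log(x),
    "LGT": lambda x, y, c, d: math.log10(x),
    "LIN": lambda x, y, c, d: c * x + d * y,
    "MUL": lambda x, y, c, d: c * x * y,
    "DIV": lambda x, y, c, d: (c * x) / (d * y),  # assumed reading of Cx/Dy
}


def transform(kind, xs, ys=None, c=1, d=1):
    """Apply one named transformation to every observation of x."""
    if ys is None:
        ys = [0] * len(xs)  # placeholder for one-variable transformations
    func = TRANSFORMS[kind]
    return [func(x, y, c, d) for x, y in zip(xs, ys)]


# The LIN example above: C = 2, D = 3.
print(transform("LIN", [3, 4, 5, 8], [4, 2, 6, 7], c=2, d=3))
```

Only the observations of x are replaced; the list passed as y is read but never modified, matching the behavior described for the LIN example.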

Example 

In the example below, the user reads a data file that has four variables with ten observations 
each. The variables and values are listed below. 



V1    V2    V3    V4

 5    32    10    56
10    14    11    58
12    76    12    60
14    20    13    62
11    12    14    64
 9    42    15    60
 7    15    16    42
16    76    17    34
20    24    18    47
12    18    19    78

The user transforms V1 to V1+V3, V2 to V2/V3, and V3 to 3(V3)^2. He writes the
transformed data on file TRANF. The original file, VARFL, is unchanged.
- TYPE VARFL <CR>                                    The contents of the file VARFL are displayed.

10 4
V1
5 10 12 14 11 9
7 16 20 12
V2
32 14 76 20 12 42
15 76 24 18
V3
10 11 12 13 14 15
16 17 18 19
V4
56 58 60 62 64 60
42 34 47 78



- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? CREME <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? TRANS <CR>
ENTER FILE NAME? VARFL <CR>                          The original data is found on file VARFL.
ENTER NAME OF VARIABLE X? V1 <CR>                    The user replaces V1 with V1 + V3.
TRANSFORMATION TYPE? LIN <CR>
ENTER NAME OF VARIABLE Y? V3 <CR>
ENTER VALUES OF CONSTANTS C AND D? 1 1 <CR>
ENTER NAME OF VARIABLE X? V2 <CR>                    The user replaces V2 with V2/V3.
TRANSFORMATION TYPE? DIV <CR>
ENTER NAME OF VARIABLE Y? V3 <CR>
ENTER VALUES OF CONSTANTS C AND D? 1 1 <CR>
ENTER NAME OF VARIABLE X? V3 <CR>                    The user replaces V3 with 3(V3)^2.
TRANSFORMATION TYPE? PWR <CR>
ENTER VALUES OF CONSTANTS C AND D? 3 2 <CR>
ENTER NAME OF VARIABLE X? STOP <CR>                  The user replies with STOP when he is finished
                                                     making transformations.
ENTER OUTPUT FILE NAME? TRANF <CR>                   The transformed data is written on file TRANF.

MODULE NAME (TYPE "HELP" FOR AID)? STOP <CR>

EXECUTION TIME: 52.90 SEC.
TOTAL ELAPSED TIME: 1 MIN. 44.42 SEC.
NO EXECUTION ERRORS DETECTED

EXIT



- TYPE TRANF.DAT <CR>                                The user lists his transformed data.

10 4
V1
 15.00000  21.00000  24.00000  27.00000  25.00000
 24.00000
 23.00000  33.00000  38.00000  31.00000
V2
 3.200000  1.272727  6.333333  1.538462  0.8571429
 2.800000
 0.9375000  4.470588  1.333333  0.9473684
V3
 300.0000  363.0000  432.0000  507.0000  588.0000
 675.0000
 768.0000  867.0000  972.0000  1083.000
V4
 56.00000  58.00000  60.00000  62.00000  64.00000
 60.00000
 42.00000  34.00000  47.00000  78.00000



SECTION 4 

BASIC STATISTICAL ANALYSES: 
THE DATA MODULE 



The DATA module has only one program, DSCRE, which allows data screening, optionally 
prints a histogram and frequency table, and generates basic statistics. 

This program accepts data from a data file written in the standard format for STATPAK, 
described on page 4. DSCRE accepts a maximum of 40 variables and 75 observations per 
variable. 

The user selects a variable for analysis. He may screen the observations to be used for this 
variable by stipulating the conditions by which observations are chosen. 

For example, the user selects for analysis the variable CASH from the data file below. 

4 3 

DEBT 

24.5 32.35 37.2 26.75 

CRED 

730.0 680.0 645.0 752.5 

CASH 

198.25 162.0 170.0 156.5 

The user may stipulate that CRED observations be greater than 650 and DEBT observations be 
greater than 25 using the conditions GT 650 and GT 25. This causes the following CASH ob- 
servations to be used for analysis: 162.0 and 156.5. 

The conditions that may be used in DSCRE are: 

GT greater than 

GE greater than or equal to 

LT less than 

LE less than or equal to 

EQ equal to 

NE not equal to 

The user may enter a maximum of 30 conditions for data screening. More than one condition 
corresponding to the same variable can be entered on the same line. For example, 

LT 20 GT 15 
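
The screening rule amounts to keeping an observation of the chosen variable only when every condition holds for the same observation number in the conditioned variables. The following sketch illustrates that rule with the DEBT, CRED, and CASH data above; the names `CONDITIONS` and `screen` are hypothetical and the code is not taken from DSCRE.

```python
import operator

# The six DSCRE condition codes mapped to comparison functions.
CONDITIONS = {
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
    "EQ": operator.eq,
    "NE": operator.ne,
}


def screen(data, target, conditions):
    """Return the observations of target that satisfy every condition.

    conditions: list of (variable name, condition code, value) tuples.
    """
    kept = []
    for i, value in enumerate(data[target]):
        if all(CONDITIONS[code](data[var][i], limit)
               for var, code, limit in conditions):
            kept.append(value)
    return kept


data = {
    "DEBT": [24.5, 32.35, 37.2, 26.75],
    "CRED": [730.0, 680.0, 645.0, 752.5],
    "CASH": [198.25, 162.0, 170.0, 156.5],
}
print(screen(data, "CASH", [("CRED", "GT", 650), ("DEBT", "GT", 25)]))
```

The first observation fails the DEBT condition and the third fails the CRED condition, leaving the CASH values 162.0 and 156.5. Two conditions on one variable, such as LT 20 GT 15, are simply two entries in the list with the same variable name.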

After the user has selected the variable and observations to be analyzed, he may print or 
save the chosen subset of data. DSCRE then computes and prints the following: 

• Histogram of up to 12 frequency classes (optional). 

• Statistics summary including maximum, minimum, average, median, variance, and standard
deviation.

• Frequency table of up to 12 frequency classes (optional). 

• Chi-square measure of goodness-of-fit. 
Example

The user chooses variable TOTAL from the file ACCT for analysis. He selects those
observations of TOTAL that correspond to zero values of PAST and values between 0 and 3999,
inclusive, of variable PRES. The user saves this subset of data on file XACCT.DAT. The user
chooses to print a histogram and frequency table as well as the basic statistics.



- TYPE ACCT <CR>

30 3
PRES
842.21 0.10 5146.67 355.31 .00 2824.94
3849.72 .00 1198.91 .00 .00 .00
271.00 .00 .00 .00 86.91 1060.10
425.33 18.78 .00 1582.87 734.30 160.93
11.77 .00 -203.65 2874.29 .00 80.00
PAST
.00 .00 156.45 .00 248.53 336.18
.00 .00 .00 .00 69.12 25.00
.00 207.30 .00 .00 .00 .00
.00 .00 .00 .00 .00 .00
.00 92.48 .00 .00 270.00 .00
TOTAL
842.21 0.10 6401.15 355.31 248.53 4000.30
3849.70 .00 1198.91 549.58 232.36 25.00
271.00 207.30 .00 .00 86.91 1060.10
425.33 18.78 .00 1582.82 734.30 160.93
20.74 92.48 -203.65 4392.70 613.67 210.00
- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? DATA <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? DSCRE <CR>
ENTER THE NAME OF DATA FILE? ACCT <CR>               The file ACCT contains the original data.
NAME OF VARIABLE CHOSEN FOR ANALYSIS? TOTAL <CR>
ARE THERE ANY VARIABLES SUBJECT TO CONDITIONS? YES <CR>
START ENTERING CONDITIONS NOW.
ENTER NAME OF VARIABLE? PRES <CR>
ENTER CODE OF CONDITION AND VALUE? GE 0 LT 4000 <CR>
ENTER NAME OF VARIABLE? PAST <CR>
ENTER CODE OF CONDITION AND VALUE? EQ 0 <CR>
ENTER NAME OF VARIABLE? STOP <CR>                    The user types STOP to stop entering conditions.
DO YOU WANT A LISTING OF THE SUBSET VECTOR?. NO <CR>
STORE THE SUBMATRIX OF OBSERVATIONS.?
TYPE FILENAME OTHERWISE TYPE NO.
XACCT <CR>                                           The file XACCT.DAT contains the submatrix.
DO YOU WANT A HISTOGRAM? YES <CR>
DO YOU WANT A FREQUENCY TABLE.? YES <CR>
ENTER NO OF FREQ CLASSES? 12 <CR>

DATA SCREENING PROBLEM 1

SUMMARY STATISTICS FOR VARIABLE TOTAL

TOT= 15759.42        AVER= 750.4486       STD DEVIATION = 1182.519
MINIMUM = 0.0000000  MAXIMUM = 4392.700   VARIANC= 1398351.
MEDIAN = 320.3010
HISTOGRAM 1

[Histogram: FREQUENCY (vertical axis, 1 to 12) plotted with asterisks against INTERVAL CLASS (horizontal axis, 1 to 12).]





FREQUENCY TABLE

CLASS            RANGE              FREQUENCY

  1     0.0000000  --  366.0583         12
  2     366.0583   --  732.1167          2
  3     732.1167   --  1098.175          3
  4     1098.175   --  1464.233          1
  5     1464.233   --  1830.292          1
  6     1830.292   --  2196.350          0
  7     2196.350   --  2562.408          0
  8     2562.408   --  2928.467          0
  9     2928.467   --  3294.525          0
 10     3294.525   --  3660.583          0
 11     3660.583   --  4026.642          1
 12     4026.642   --  4392.700          1



MEASURE OF GOODNESS OF FIT
CHI SQUARE = 71.00000    WITH DEGREES OF FREEDOM 11

ENTER THE NAME OF DATA FILE? STOP <CR>

MODULE NAME (TYPE "HELP" FOR AID)?                   The DSCRE program has terminated.
                                                     The user may call another module or
                                                     type STOP to return to XEXEC.
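
The frequency classes and the chi-square figure in this example can be checked by hand. The sketch below is an illustration built on two assumptions that the manual does not state: classes are equal-width intervals between the minimum and maximum, and the goodness-of-fit statistic compares the observed class counts with equal expected counts. Under those assumptions the 21 screened TOTAL values give the class counts shown in the frequency table and a chi-square value that agrees with the printed 71.0. The function names are hypothetical.

```python
def frequency_classes(values, n_classes):
    """Count values in equal-width classes between min and max."""
    low, high = min(values), max(values)
    width = (high - low) / n_classes
    if width == 0:
        raise ValueError("all values are equal; classes are undefined")
    counts = [0] * n_classes
    for value in values:
        # The maximum value falls in the last class.
        k = min(int((value - low) / width), n_classes - 1)
        counts[k] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square of observed counts against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# TOTAL observations with PAST equal to 0 and PRES from 0 up to 4000.
subset = [
    842.21, 0.10, 355.31, 3849.70, 0.00, 1198.91, 549.58,
    271.00, 0.00, 0.00, 86.91, 1060.10, 425.33, 18.78,
    0.00, 1582.82, 734.30, 160.93, 20.74, 4392.70, 210.00,
]
counts = frequency_classes(subset, 12)
print(counts)
print(round(sum(subset), 2), round(chi_square_uniform(counts), 4))
```

With 12 classes the statistic has 12 - 1 = 11 degrees of freedom, matching the printed output.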
SECTION 5

REGRESSION ANALYSES: 
THE REGON MODULE 



The regression analyses available in REGON are the multiple linear regression analysis, the 
stepwise multiple linear regression analysis, and the polynomial regression analysis. These 
analyses are performed by the programs DREGR, RGSTP, and RGPOL, respectively. These 
programs accept data from a file written in the standard format for STATPAK, described on 
page 4. 



The DREGR Program 

DREGR performs a multiple linear regression analysis between a dependent variable, $y$, and
a set of independent variables, $x_1, x_2, \ldots, x_m$, based on a set of observations. A linear
relationship of the form

$$y = a + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m$$

is established, where $a$ is an intercept, and $b_i$ is a regression coefficient.

DREGR accepts up to 40 variables and 75 observations per variable. The number of obser- 
vations per variable should exceed the number of variables by at least two. 

When executing the program, the user enters the name of the data file, the dependent vari- 
able, and each of the independent variables. The list of independent variables is terminated by 
typing STOP, followed by a Carriage Return. DREGR then computes and prints the regression 
analysis statistics, an analysis of variance for the regression, and a table of residuals. The pro- 
gram then requests the next dependent variable. The user types STOP to exit the program. 
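
For readers who want to see what fitting such a relationship involves, the following is a minimal least-squares sketch that solves the normal equations by Gaussian elimination. It is not DREGR's algorithm, which the manual does not describe; the function names `solve` and `fit_linear` are hypothetical, and the small data set is synthetic, constructed so that the exact answer is known.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(y, xs):
    """Least-squares fit of y = a + b1*x1 + ... + bm*xm.

    xs is a list of m columns, one per independent variable.
    Returns (a, [b1, ..., bm]).
    """
    rows = [[1.0] + [col[i] for col in xs] for i in range(len(y))]
    k = len(xs) + 1
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)]
           for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    coef = solve(xtx, xty)
    return coef[0], coef[1:]


# Synthetic data generated from y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]
intercept, slopes = fit_linear(y, [x1, x2])
print(intercept, slopes)
```

Solving the normal equations directly is adequate for small, well-conditioned problems like this one; for nearly collinear variables a more stable method would be preferable.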

Example 

A multiple linear regression analysis is performed on the data in file DATA2, below. FACT6 
is the dependent variable. FACT1, FACT2, FACT3, FACT4, and FACT5 are the independent 
variables. 



- TYPE DATA2 <CR>

15 8
FACT1
29 30 30 30 35 35
43 43 44 44 44 44
44 44 45
FACT2
289 391 424 313 243 365
396 356 346 156 278 349
141 245 297
FACT3
216 244 246 239 275 219
267 274 255 258 249 252
236 236 256
FACT4
85 92 90 91 95 95
100 79 126 95 110 88
129 97 111
FACT5
14 16 18 10 30 21
39 19 56 28 42 21
56 24 45
FACT6
1 2 2 0 2 2
3 2 3 0 4 1
1 1 3
FACT7
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15
FACT8
10 16 20 23 25 26
30 36 48 62 78 94
107 118 127



- R STATPAK <CR>

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? REGON <CR>
PROGRAM NAME (TYPE "HELP" FOR AID)? DREGR <CR>
ENTER NAME OF DATA FILE? DATA2 <CR>
NAME OF DEPENDENT VARIABLE? FACT6 <CR>
ENTER NAMES OF INDEPENDENT VARIABLES, ONE TO A ROW IN RESPONSE TO ?.
TYPE STOP WHEN DONE.
? FACT1 <CR>
? FACT2 <CR>
? FACT3 <CR>
? FACT4 <CR>
? FACT5 <CR>
? STOP <CR>



VARIABLE     MEAN      STANDARD    CORRELATION   REGRESSION    STD. ERROR     CAL
NAME                   DEVIATION   X VS Y        COEFFICIENT   OF REG.COEF    T VAL

FACT1        38.933      6.508       0.266         -0.035         0.053       -0.655
FACT2       305.933     83.365       0.412          0.008         0.003        2.958
FACT3       248.133     17.423       0.309         -0.006         0.017       -0.368
FACT4        98.867     14.287       0.360         -0.073         0.060       -1.216
FACT5        29.267     14.911       0.505          0.133         0.065        2.032
DEPENDENT
FACT6         1.800      1.146

INTERCEPT                  5.525384
MULTIPLE CORRELATION       0.8427965
STD. ERROR OF ESTIMATE     0.7695865



ANALYSIS OF VARIANCE FOR THE REGRESSION

SOURCE OF VARIATION          DEGREES       SUM OF       MEAN         F VAL
                             OF FREEDOM    SQUARES      SQUARES

DUE TO REGRESSION                5         13.06963     2.613926     4.413453
DEVIATION FROM REGRESSION        9         5.330370     0.5922633
TOTAL                           14         18.40000
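
The entries of this table, together with the multiple correlation and standard error of estimate printed above it, follow from the two sums of squares by standard formulas. The sketch below recomputes them as a check; the function name `regression_anova` is hypothetical, and the formulas are the usual textbook ones rather than anything quoted from the manual.

```python
import math


def regression_anova(ss_regression, ss_residual, n_obs, n_predictors):
    """Derive the analysis-of-variance quantities for a regression."""
    df_regression = n_predictors
    df_residual = n_obs - n_predictors - 1
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    ss_total = ss_regression + ss_residual
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "df_total": n_obs - 1,
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_value": ms_regression / ms_residual,
        # Multiple correlation: square root of the explained share.
        "multiple_correlation": math.sqrt(ss_regression / ss_total),
        # Standard error of estimate: root of the residual mean square.
        "std_error_of_estimate": math.sqrt(ms_residual),
    }


# Sums of squares from the example: 15 observations, 5 independent variables.
table = regression_anova(13.06963, 5.330370, n_obs=15, n_predictors=5)
for key, value in table.items():
    print(f"{key:24s} {value}")
```

With 15 observations and 5 independent variables the residual degrees of freedom are 15 - 5 - 1 = 9, and the recomputed F value, multiple correlation, and standard error of estimate agree with the printed figures to the precision shown.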



NEED TABLE OF RESIDUALS.? 

TYPE TTY, DSK,BOTH OR NONE. 

BOTH p The table of residuals is printed on the terminal and on a file. 
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MULTIPLE REGRESSION. 



.Y= FACT6 



TABLE OF RESIDUALS 

CASE NO.    Y VALUE      Y ESTIMATE    RESIDUAL 
    1       1.000000     1.196935      -0.1969345 
    2       2.000000     1.589456       0.4105439 
    3       2.000000     2.264102      -0.2641021 
    4       0.0000000    0.2468169     -0.2468169 
    5       2.000000     1.621938       0.3780617 
    6       2.000000     1.800457       0.1995433 
    7       3.000000     3.499589      -0.4995890 
    8       2.000000     2.003885      -0.3884822E-02 
    9       3.000000     3.476657      -0.4766569 
   10       0.0000000    0.4241671     -0.4241671 
   11       4.000000     2.259888       1.740112 
   12       1.000000     1.656622      -0.6566221 
   13       1.000000     1.666003      -0.6660032 
   14       1.000000     0.6291739      0.3708261 
   15       3.000000     2.664310       0.3356900 


ENTER OUTPUT FILENAME.?RES2          The file RES2.DAT contains the residuals. 

NAME OF RESIDUAL ARRAY?.ARR2         The residual array on the file is named ARR2. 

NAME OF DEPENDENT VARIABLE?STOP 





MODULE NAME (TYPE "HELP" FOR AID)? 



The contents of RES2.DAT are shown below. 
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ARR2 
-0.1969345 

0.1995433 
-0.499 5890 

-0.6566221 
-0.6660032 



1 
RESIDUAL DATA 
0.4105439 -0.2641021 

-0.3884822E-02 -0.4766569 

0.3708261 0.3356900 



FACT 6 
-0.2468169 

-0.4241671 



0.3780617 
1 . 740 1 1 2 
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The RGSTP Program 

This program performs a stepwise multiple linear regression analysis between a dependent 
variable, y, and a set of independent variables, x_1, x_2, ..., x_m, based on a set of observations. 
A linear relationship of the form 

y = a + b_1 x_1 + b_2 x_2 + ... + b_m x_m 

is established, where a is the intercept and b_i is the regression coefficient of x_i. 

RGSTP accepts up to 40 variables and 75 observations per variable. The number of observa- 
tions per variable should exceed the number of variables by at least three. 

RGSTP computes and prints the following: 

• The mean and standard deviation for each variable (optional). 

• A correlation matrix (optional). 

• The regression analysis as each variable is entered. 

• A table of residuals (optional). 

When executing RGSTP, the user enters the name of the data file, the dependent variable, 
and the independent variables with the appropriate codes. The code signifies how the inde- 
pendent variable is to be used in the regression. The codes are: 

Code Meaning 

0 The variable is free to enter or leave the regression. 

1 The variable is forced into the regression regardless of the entry criterion 
described below. 

2 The variable is forced out of the regression regardless of the entry criterion. 

To terminate the list of independent variables and codes, the user types STOP, followed by 
a Carriage Return. 

RGSTP requests an entry criterion, PCT, expressed as a proportion of the sum of squares of 
the dependent variable. To be entered into the regression, a variable must account for a 
proportion of this variability at least as large as PCT. The value of PCT should be between 0 
and 1, inclusive. A user with no particular preference for this criterion may use the value 0. 
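
The entry test can be sketched in Python for the first step of a stepwise run. This is only an illustration under stated assumptions, not RGSTP's own code: the function names are hypothetical, and the sketch relies on the fact that a variable entered alone accounts for a proportion r^2 of the dependent variable's sum of squares, where r is its simple correlation with y. The VIEW, SLOPE, and PRICE figures are those of the FILE1 example in this section.

```python
def proportion_explained(x, y):
    """Proportion of the sum of squares of y accounted for by x alone (r squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def first_entry(candidates, y, pct):
    """Return (name, proportion) of the first variable to enter, or None.

    candidates maps variable names to lists of observations;
    pct is the entry criterion, a proportion between 0 and 1.
    """
    scored = [(name, proportion_explained(obs, y)) for name, obs in candidates.items()]
    name, prop = max(scored, key=lambda item: item[1])
    return (name, prop) if prop >= pct else None


# Columns taken from the FILE1 example (20 lots).
view = [2, 2, 1, 1, 2, 2, 4, 6, 9, 9, 9, 9, 8, 6, 8, 8, 7, 7, 7, 1]
slope = [1.5, 1.8, 2.9, 1, 0.5, 1, 5.7, 5.4, 17.5, 14.5, 14.4, 12.2,
         5, 13.1, 15.2, 10.1, 7.4, 5.8, 5.1, 17.3]
price = [4.1, 3.9, 3.2, 2.9, 3.9, 4.1, 5.8, 5.1, 6.8, 6.8, 6.5, 7,
         5.8, 5.1, 5.3, 4.9, 6, 5.3, 4.8, 4.3]

print(first_entry({"VIEW": view, "SLOPE": slope}, price, 0.0))
```

With PCT set to 0, the best single variable always qualifies; a large PCT such as 0.9 would keep every variable out at this step.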

The user may perform as many analyses as he chooses and may terminate the program by 
typing STOP in response to: 
NAME OF DEPENDENT VARIABLE? 
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Example 

The data below corresponds to 20 lots on a tract of land. The variables are AREA in square 
feet, ELEVN (elevation) in feet above sea level, SLOPE in degrees, VIEW on a scale of 1 to 9 
ranging from poor to excellent, and PRICE in thousands of dollars. The data is stored on a 
file named FILE1. 



- TYPE FILEl p 



20 5 
AREA 
14.7 14.2 12.7 13.6 14.4 17.4 
21.8 14 17.5 23 18.3 19.4 
15.2 18.3 21.7 16.7 13.6 14.5 
12.1 17.4 
ELEVN 
155 155 158 158 155 157 
172 170 175 185 185 205 
215 195 178 160 205 190 
203 125 
SLOPE 
1.5 1.8 2.9 1 .5 1 
5.7 5.4 17.5 14.5 14.4 12.2 
5 13.1 15.2 10.1 7.4 5.8 
5.1 17.3 
VIEW 
2 2 1 1 2 2 
4 6 9 9 9 9 
8 6 8 8 7 7 
7 1 
PRICE 
4.1 3.9 3.2 2.9 3.9 4.1 
5.8 5.1 6.8 6.8 6.5 7 
5.8 5.1 5.3 4.9 6 5.3 
4.8 4.3 



The user performs a stepwise multiple linear regression with PRICE as the dependent variable, 
and AREA, ELEVN, VIEW, and SLOPE as independent variables. The entry criterion is set to 
zero. 

- R STATPAK -p 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME C TYPE "HELP" FOR AID)? REGON ? 
PROGRAM NAME (TYPE "HELP" FOR AID)? RGSTP ? 
ENTER NAME OF DATA FILE.?FI_LELp 
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NAME OF DEPENDENT VARIABLE. ? PRICE p 

START ENTERING INDEPENDENT VARIABLES AND CODES NOW. 

CODE 0 MEANS A VARIABLE IS FREE TO ENTER OR LEAVE REGRESSION 

CODE 1 MEANS A VARIABLE IS FORCED INTO REGRESSION 

CODE 2 MEANS A VARIABLE IS FORCED OUT OF REGRESSION. 

ENTER NAME OF INDEPENDENT VARIABLE.?AREAp 

ENTER CODE.? Op 

ENTER NAME OF INDEPENDENT VARIABLE.? EL EVN ^ 

ENTER CODE.?0_p 

ENTER NAME OF INDEPENDENT VARIABLE.? SLOPE p 

ENTER CODE.?0_p 

ENTER NAME OF INDEPENDENT VARIABLE.? VI EW p 

ENTER CODE.?0_p 

ENTER NAME OF INDEPENDENT VARI ABLE.?STOPp 

DO YOU WANT A PRINTOUT OF MEANS AND STD. DEVS.?YESp 

DO YOU WANT A PRINTOUT OF CORRELATION MATRIX. ?YESp 

NEED RESIDUAL TABLE?. 
TYPE TTY, DSK,BOTH OR NONE. 
BOTHp 

ENTER OUTPUT FILENAME?.RES1          The residuals are written to the file RES1.DAT and printed on the terminal. 

ENTER THE VALUE OF PCT,THE PROPORTION OF SUM OF 
SQUARES THAT SHOULD BE USED AS CRITERION IN ENTERING A 
VARIABLE INTO REGRESSION. VALUE SHOULD BE BETWEEN 0 AND 1. 
0 

NUMBER OF OBSERVATIONS 20 

NUMBER OF VARIABLES 5 

CONSTANT TO LIMIT VARIABLES 0.00000 



VARIABLE MEAN STANDARD 

NAME DEVIATION 



PRICE 5.080000 1.198947 

AREA 16.53500 3.156326 

ELEVN 175.0500 22.80230 

SLOPE 7.870000 5.871976 

VIEW 5.400000 3.135535 
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CORRELATION MATRIX 
ROW PRICE 
1.000000       0.5775156      0.6445836      0.6644410      0.8786537 

ROW AREA 
0.5775156      1.000000       0.7039703E-01  0.6297176      0.3963015 

ROW ELEVN 
0.6445836      0.7039703E-01  1.000000       0.1515843      0.7490887 

ROW SLOPE 
0.6644410      0.6297176      0.1515843      1.000000       0.6075628 

ROW VIEW 
0.8786537      0.3963015      0.7490887      0.6075628      1.000000 



DEPENDENT VARIABLE                       PRICE 
NUMBER OF VARIABLES FORCED               0 
NUMBER OF VARIABLES DELETED              0 

STEP 1 

VARIABLE ENTERED                         VIEW 
SUM OF SQUARES REDUCED IN THIS STEP      21.08575 
PROPORTION REDUCED IN THIS STEP          0.7720323 
CUMULATIVE SUM OF SQUARES REDUCED        21.08575 
CUMULATIVE PROPORTION REDUCED            0.7720323 OF 27.31200 

FOR 1 VARIABLES ENTERED 
MULTIPLE CORRELATION COEFFICIENT         0.8786537 
(ADJUSTED FOR D.F.)                      0.8786537 
F VALUE FOR ANALYSIS OF VARIANCE         60.95856 
STANDARD ERROR OF ESTIMATE               0.5881352 
(ADJUSTED FOR D.F.)                      0.5881352 

VARIABLE     REGRESSION       STD ERROR OF     COMPUTED 
NAME         COEFFICIENT      REG COEFF        T-VALUE 
VIEW         0.3359743        0.4303172E-01    7.807596 
INTERCEPT    3.265739 
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STEP 2 

VARIABLE ENTERED                         AREA 
SUM OF SQUARES REDUCED IN THIS STEP      1.703636 
PROPORTION REDUCED IN THIS STEP          0.6237634E-01 
CUMULATIVE SUM OF SQUARES REDUCED        22.7893 
CUMULATIVE PROPORTION REDUCED            0.8344092 OF 27.31200 

FOR 2 VARIABLES ENTERED 
MULTIPLE CORRELATION COEFFICIENT         0.9134600 
(ADJUSTED FOR D.F.)                      0.9084105 
F VALUE FOR ANALYSIS OF VARIANCE         42.83134 
STANDARD ERROR OF ESTIMATE               0.5157871 
(ADJUSTED FOR D.F.)                      0.5299208 

VARIABLE     REGRESSION       STD ERROR OF     COMPUTED 
NAME         COEFFICIENT      REG COEFF        T-VALUE 
VIEW         0.2947526        0.4110384E-01    7.170926 
AREA         0.1033309        0.4083308E-01    2.530568 
INTERCEPT    1.779760 

STEP 3 

VARIABLE ENTERED                         ELEVN 
SUM OF SQUARES REDUCED IN THIS STEP      0.1664590 
PROPORTION REDUCED IN THIS STEP          0.6094720E-02 
CUMULATIVE SUM OF SQUARES REDUCED        22.95584 
CUMULATIVE PROPORTION REDUCED            0.8405039 OF 27.31200 

FOR 3 VARIABLES ENTERED 
MULTIPLE CORRELATION COEFFICIENT         0.9167900 
(ADJUSTED FOR D.F.)                      0.9064986 
F VALUE FOR ANALYSIS OF VARIANCE         28.10531 
STANDARD ERROR OF ESTIMATE               0.5217853 
(ADJUSTED FOR D.F.)                      0.5516253 

VARIABLE     REGRESSION       STD ERROR OF     COMPUTED 
NAME         COEFFICIENT      REG COEFF        T-VALUE 
VIEW         0.2532135        0.6746307E-01    3.753365 
AREA         0.1162890        0.4450826E-01    2.612752 
ELEVN        0.6676355E-02    0.8538421E-02    0.7819191 
INTERCEPT    0.6211119 
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STEP 4 

VARIABLE ENTERED                         SLOPE 
SUM OF SQUARES REDUCED IN THIS STEP      0.1975852 
PROPORTION REDUCED IN THIS STEP          0.7234373E-02 
CUMULATIVE SUM OF SQUARES REDUCED        23.15343 
CUMULATIVE PROPORTION REDUCED            0.8477383 OF 27.31200 

FOR 4 VARIABLES ENTERED 
MULTIPLE CORRELATION COEFFICIENT         0.9207270 
(ADJUSTED FOR D.F.)                      0.9050907 
F VALUE FOR ANALYSIS OF VARIANCE         20.87864 
STANDARD ERROR OF ESTIMATE               0.5265341 
(ADJUSTED FOR D.F.)                      0.5737773 

VARIABLE     REGRESSION       STD ERROR OF     COMPUTED 
NAME         COEFFICIENT      REG COEFF        T-VALUE 
VIEW         0.2048664        0.8896190E-01    2.302856 
AREA         0.9873099E-01    0.4949519E-01    1.994759 
ELEVN        0.1067605E-01    0.9832821E-02    1.085756 
SLOPE        0.2949830E-01    0.3494189E-01    0.8442104 
INTERCEPT    0.2402108 



TABLE OF RESIDUALS 

CASE NO.    Y VALUE     Y ESTIMATE    RESIDUAL 
    1       4.100000    3.800324       0.2996761 
    2       3.900000    3.759808       0.1401921 
    3       3.200000    3.471321      -0.2713212 
    4       2.900000    3.523879      -0.6238786 
    5       3.900000    3.741206       0.1587937 
    6       4.100000    4.073500       0.2649951E-01 
    7       5.800000    5.216432       0.5835675 
    8       5.100000    4.825862       0.2741379 
    9       6.800000    6.196330       0.6036704 
   10       6.800000    6.757616       0.4238433E-01 
   11       6.500000    6.290630       0.2093699 
   12       7.000000    6.547859       0.4521412 
   13       5.800000    5.822695      -0.2269495E-01 
   14       5.100000    5.744443      -0.6444435 
   15       5.300000    6.370315      -1.070315 
   16       4.900000    5.534050      -0.6340502 
   17       6.000000    5.423894       0.5761057 
   18       5.300000    5.305414      -0.5414307E-02 
   19       4.800000    5.186600      -0.3865997 
   20       4.300000    4.007823       0.2921771 

NAME OF RESIDUAL ARRAY?.ARR1 



NAME OF DEPENDENT VARIABLE. ? STOP -p 



MODULE NAME (TYPE "HELP" FOR AID)? 
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The RGPOL Program 

RGPOL performs a polynomial regression analysis between a dependent variable, y, and an 
independent variable, x, based on a set of observations. A polynomial relationship of the form 

y = a + b_1 x + b_2 x^2 + ... + b_k x^k 

is established, where a is the intercept and b_i is the regression coefficient of x^i. 

RGPOL accepts up to 75 observations per variable. The number of observations per variable 
should exceed the degree of the polynomial by at least two. 

The analysis begins with a first degree polynomial of the form 

y = a + b_1 x 

then proceeds to the next degree polynomial, 

y = a + b_1 x + b_2 x^2 

and continues to the polynomial 

y = a + b_1 x + b_2 x^2 + ... + b_k x^k 

where k is the degree specified by the user, between 1 and 10, inclusive. At each step, RGPOL 
checks on the improvement in fit caused by going to a higher degree polynomial. If there is no 
improvement, higher degree fits are ignored. 
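
The degree-by-degree procedure can be sketched in Python as follows. This is a minimal illustration, not RGPOL's own algorithm: the function names are hypothetical, the fit uses the normal equations with plain Gaussian elimination, and "improvement" is taken to mean the increase in the sum of squares due to regression, with the loop stopping when that increase is not positive. The data are FACT7 and FACT8 from Example 2 of this section.

```python
def solve(a, b):
    """Solve the linear system a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def polyfit(x, y, degree):
    """Least-squares coefficients [a, b_1, ..., b_k] from the normal equations."""
    p = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(p)] for i in range(p)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(p)]
    return solve(a, b)


def regression_ss(x, y, coef):
    """Sum of squares due to regression: squared deviations of the fit from mean y."""
    mean_y = sum(y) / len(y)
    return sum((sum(c * xi ** i for i, c in enumerate(coef)) - mean_y) ** 2 for xi in x)


def stepwise_degrees(x, y, max_degree):
    """Yield (degree, coefficients, improvement in SS) until the fit stops improving."""
    previous = 0.0
    for degree in range(1, max_degree + 1):
        coef = polyfit(x, y, degree)
        ss = regression_ss(x, y, coef)
        if ss - previous <= 0:
            break
        yield degree, coef, ss - previous
        previous = ss


fact7 = list(range(1, 16))
fact8 = [10, 16, 20, 23, 25, 26, 30, 36, 48, 62, 78, 94, 107, 118, 127]
for degree, coef, gain in stepwise_degrees(fact7, fact8, 2):
    print(degree, coef, gain)
```

The normal equations become poorly conditioned as the degree grows, so a production routine would normally use a more stable method; the sketch is limited to low degrees for that reason.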

RGPOL allows a lead/lag analysis of data. This analysis is particularly useful with a first 
degree polynomial of the form: 

y = ax + b 

The user can analyze an independent variable whose effect on the dependent variable leads or 
lags the corresponding observations. 
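
The alignment implied by a lead or lag can be sketched in Python. This is an assumption-laden illustration rather than RGPOL's code: the helper names are hypothetical, and a positive number of periods is taken to pair each x observation with the y observation that many periods later, which shortens both series by that amount. The figures are ADVTG and SALES from Example 1, with the sixth SALES value taken as 218, the value that appears in that example's table of residuals.

```python
def align(x, y, periods):
    """Pair x[i] with y[i + periods]; periods may be positive, zero, or negative."""
    if periods >= 0:
        return x[:len(x) - periods], y[periods:]
    return x[-periods:], y[:len(y) + periods]


def linear_fit(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


advtg = [10, 12, 11, 11, 10, 12, 12, 12, 10, 10, 12, 12]
sales = [200, 210, 205, 222, 217, 218, 198, 205, 210, 211, 207, 202]

x, y = align(advtg, sales, 2)      # 10 usable pairs remain out of 12
intercept, slope = linear_fit(x, y)
print(len(x), intercept, slope)
```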

Example 1 

This example demonstrates the lead/lag feature of RGPOL. The user has a 12-month table 
of advertising and sales. He believes advertising affects sales with a two-month lag. The data, 
in thousands of dollars, is stored on the file MKTG. 

- TYPE MKTG d 



12 2 

ADVTG 

10 12 11 11 10 12 

12 12 10 10 12 12 

SALES 

200 210 205 222 217 218 

198 205 210 211 207 202 
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- R STATPAK ^ 

TYMSHARE PDP-10 STATPAK VERSION 3.0i (11/9/71) 

MODULE NAME (TYPE "HELP" FOR AID)? REGON d 

PROGRAM NAME (TYPE "HELP" FOR AID)? RGPOL p 

ENTER THE NAME OF DATA FILE.? MKTG ? 

NAME OF DEPENDENT VARIABLE.? SALES? 

NAME OF INDEPENDENT VARIABLE.?ADVTG 

HIGHEST DEGREE OF POLYNOMIAL TO BE TRIED.?1          The user assumes a linear relationship. 

IF THE DEPENDENT VARIABLE LEADS/LAGS THE INDEPENDENT 
VARIABLE ENTER PERIODS AS A POSITIVE/NEGATIVE NUMBER. 
OTHERWISE TYPE 0. 
2 

NUMBER OF OBSERVATIONS 10          The number of observations is 10 instead of 12 because there is a lead of 2 periods. 



POLYNOMIAL REGRESSION OF DEGREE 1 

INTERCEPT 160.0000 

REGRESSION COEFFICIENTS 
4.500000 

ANALYSIS OF VARIANCE FOR 1 DEGREE POLYNOMIAL 

SOURCE OF VARIATION    D.F.    SUM OF       MEAN         F       IMPROVEMENT 
                               SQUQRES      SQUARE       VAL     IN SS 
DUE TO REGRESSION        1     162.0000     162.0000     3.60    162.000 
DEV FROM REGRESSN        8     360.5000     45.06250 
TOTAL                    9     522.5000 

STANDARD ERROR OF FORECAST 45.062 

NEED RESIDUAL TABLE?. 

TYPE TTY,DSK,BOTH OR NONE. 

BOTH          The user requests that the residual table be printed on the terminal and written on a file. 

OUTPUT FILENAME?.MRES          The residuals are written on file MRES.DAT. 
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POLYNOMIAL REGRESSION X= ADVTG Y= SALES 



POLYNOMIAL REGRESSION OF DEGREE 1 



TABLE OF RESIDUALS 

OBSERVATION    X VALUE     Y VALUE     Y ESTIMATE    RESIDUAL 
     1         10.00000    205.0000    205.0000       0.0000000 
     2         12.00000    222.0000    214.0000       8.000000 
     3         11.00000    217.0000    209.5000       7.500000 
     4         11.00000    218.0000    209.5000       8.500000 
     5         10.00000    198.0000    205.0000      -7.000000 
     6         12.00000    205.0000    214.0000      -9.000000 
     7         12.00000    210.0000    214.0000      -4.000000 
     8         12.00000    211.0000    214.0000      -3.000000 
     9         10.00000    207.0000    205.0000       2.000000 
    10         10.00000    202.0000    205.0000      -3.000000 

RESIDUAL ARRAY NAME?.RESID 







MODULE NAME (TYPE "HELP" FOR AID)? STOP 



EXECUTION TIME* 2 MIN. 0.09 SEC 

TOTAL ELAPSED TIME* 4 MIN. 22.43 SEC. 

NO EXECUTION ERRORS DETECTED 

EXIT 



- TYPE MRES.DAT t) 



The user prints the file of residuals. 
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RESID 

0.0000000 

-9.000000 
-4.000000 



1 



8.000000 
-3.000000 



RESIDUAL DATA SALES 

7.500000 8.500000 



2.000000 



-3.000000 



-7.000000 
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Example 2 



The user performs a polynomial regression analysis of the fourth degree on the variables 
FACT8 and FACT7 of data file DATA2. 



- TYPE DATA2 T) 



15 8 
FACT1 
29 30 30 30 35 35 
43 43 44 44 44 44 
44 44 45 
FACT2 
289 391 424 313 243 365 
396 356 346 156 278 349 
141 245 297 
FACT3 
216 244 246 239 275 219 
267 274 255 258 249 252 
236 236 256 
FACT4 
85 92 90 91 95 95 
100 79 126 95 110 88 
129 97 111 
FACT5 
14 16 18 10 30 21 
39 19 56 28 42 21 
56 24 45 
FACT6 
1 2 2 0 2 2 
3 2 3 0 4 1 
1 1 3 
FACT7 
1 2 3 4 5 6 
7 8 9 10 11 12 
13 14 15 
FACT8 
10 16 20 23 25 26 
30 36 48 62 78 94 
107 118 127 
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- R STATPAK d 

TYMSHARE PDP-10 STATPAK VERSION 3.01 < 11/9/71) 

MODULE NAME (TYPE "HELP" FOR AID)? REGON p 

PROGRAM NAME (TYPE "HELP" FOR AID)? RGPOL p 

ENTER THE NAME OF DATA FILE.?DATA2^ 

NAME OF DEPENDENT VARIABLE.?FACT8 

NAME OF INDEPENDENT VARIABLE.?FACT7 

HIGHEST DEGREE OF POLYNOMIAL TO BE TRIED.?4 

IF THE DEPENDENT VARIABLE LEADS/LAGS THE INDEPENDENT 
VARIABLE ENTER PERIODS AS A POSITIVE/NEGATIVE NUMBER. 
OTHERWISE TYPE 0. 
0          There is no lead or lag. 



NUMBER OF OBSERVATIONS 15 

POLYNOMIAL REGRESSION OF DEGREE 1 

INTERCEPT -13.87619 

REGRESSION COEFFICIENTS 
8.567857 

ANALYSIS OF VARIANCE FOR 1 DEGREE POLYNOMIAL 

SOURCE OF VARIATION    D.F.    SUM OF       MEAN         F         IMPROVEMENT 
                               SQUQRES      SQUARE       VAL       IN SS 
DUE TO REGRESSION        1     20554.29     20554.29     135.57    20554.289 
DEV FROM REGRESSN       13     1971.045     151.6188 
TOTAL                   14     22525.33 

STANDARD ERROR OF FORECAST 151.619 
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POLYNOMIAL REGRESSION OF DEGREE 2 



INTERCEPT 15.07252 

REGRESSION COEFFICIENTS 
-1.649337 0.6385746 

ANALYSIS OF VARIANCE FOR 2 DEGREE POLYNOMIAL 

SOURCE OF VARIATION    D.F.    SUM OF       MEAN         F         IMPROVEMENT 
                               SQUQRES      SQUARE       VAL       IN SS 
DUE TO REGRESSION        2     22236.51     11118.25     461.94    1682.219 
DEV FROM REGRESSN       12     288.8257     24.06881 
TOTAL                   14     22525.33 

STANDARD ERROR OF FORECAST 24.069 

POLYNOMIAL REGRESSION OF DEGREE 3 

INTERCEPT 18.52399 

REGRESSION COEFFICIENTS 
-3.885354 0.9769363 -0.1409850E-01 

ANALYSIS OF VARIANCE FOR 3 DEGREE POLYNOMIAL 

SOURCE OF VARIATION    D.F.    SUM OF       MEAN         F         IMPROVEMENT 
                               SQUQRES      SQUARE       VAL       IN SS 
DUE TO REGRESSION        3     22247.81     7415.936     293.94    11.300 
DEV FROM REGRESSN       11     277.5259     25.22963 
TOTAL                   14     22525.33 

STANDARD ERROR OF FORECAST 25.230 
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POLYNOMIAL REGRESSION OF DEGREE 4 



INTERCEPT -6.042034 

REGRESSION COEFFICIENTS 
19.98636 -5.247136 0.5774151 -0.1848480E-01 

ANALYSIS OF VARIANCE FOR 4 DEGREE POLYNOMIAL 

SOURCE OF VARIATION    D.F.    SUM OF       MEAN         F          IMPROVEMENT 
                               SQUQRES      SQUARE       VAL        IN SS 
DUE TO REGRESSION        4     22508.35     5627.087     3312.91    260.541 
DEV FROM REGRESSN       10     16.98535     1.698535 
TOTAL                   14     22525.33 

STANDARD ERROR OF FORECAST 1.699 

NEED RESIDUAL TABLE?. 
TYPE TTY,DSK,BOTH OR NONE. 
TTY 



POLYNOMIAL REGRESSION X= FACT7 Y= FACT8 

POLYNOMIAL REGRESSION OF DEGREE 4 

TABLE OF RESIDUALS 

OBSERVATION    X VALUE     Y VALUE     Y ESTIMATE    RESIDUAL 
     1         1.000000    10.00000    9.256118       0.7438824 
     2         2.000000    16.00000    17.26570      -1.265702 
     3         3.000000    20.00000    20.78576      -0.7857559 
     4         4.000000    23.00000    22.17168       0.8283164 
     5         5.000000    25.00000    23.33525       1.664749 
     6         6.000000    26.00000    25.74459       0.2554083 
     7         7.000000    30.00000    30.42420      -0.4241996 
     8         8.000000    36.00000    37.95494      -1.954942 
     9         9.000000    48.00000    48.47404      -0.4740391 
    10         10.00000    62.00000    61.67508       0.3249168 
    11         11.00000    78.00000    76.80804       1.191956 
    12         12.00000    94.00000    92.67921       1.320786 
    13         13.00000    107.0000    107.6513      -0.6513062 
    14         14.00000    118.0000    119.6433      -1.643349 
    15         15.00000    127.0000    126.1308       0.8692398 



MODULE NAME (TYPE "HELP" FOR AID)? 
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SECTION 6 

PARAMETRIC HYPOTHESIS TESTING: 
THE PARH MODULE 



The PARH module consists of the programs TSTAT and FSTAT. TSTAT computes the 
student's t-statistic under four different hypotheses. FSTAT computes the F-ratio and the 
corresponding degrees of freedom. 

Both programs accept data from the terminal or from one or more data files written in the 
standard format for STATPAK, described on page 4. FSTAT and TSTAT accept up to 100 
observations per variable. 



The TSTAT Program 

TSTAT calculates the Student's t-statistic and corresponding degrees of freedom for a set of 
data. The t-test is valid for normally distributed parent populations. TSTAT can test any of 
the hypotheses below with the t-statistic. 

Code Hypothesis 

1 The sample observations corresponding to the variable come from a population with 
a specified mean. 

2 The sample observations corresponding to two variables come from populations with 
the same mean, assuming the populations have equal variances. 

NOTE: The validity of this assumption can be checked by computing the F-ratio in 
FSTAT. 

3 The same as hypothesis 2, but assuming the populations have unequal variances. 

4 The same as hypothesis 2, but assuming there is no information on the population 
variances. 

NOTE: The two variables must have the same number of observations to test this 
hypothesis. 
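
The statistics for hypotheses 1 and 2 can be sketched in Python using the textbook formulas. This is an illustration under stated assumptions, not TSTAT's code: the function names are hypothetical, the two-sample statistic is computed as the mean of the second sample minus the mean of the first (a sign convention chosen here), and hypotheses 3 and 4 are omitted because this section does not give their formulas. The data are those of Examples 1 and 2 below.

```python
import math


def mean(v):
    return sum(v) / len(v)


def sum_sq(v):
    """Sum of squared deviations from the sample mean."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v)


def t_one_sample(v, mu):
    """Hypothesis 1: t-statistic and degrees of freedom for mean == mu."""
    n = len(v)
    variance = sum_sq(v) / (n - 1)
    return (mean(v) - mu) / math.sqrt(variance / n), n - 1


def t_pooled(a, b):
    """Hypothesis 2: equal means, assuming equal population variances."""
    na, nb = len(a), len(b)
    pooled = (sum_sq(a) + sum_sq(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / math.sqrt(pooled * (1 / na + 1 / nb)), na + nb - 2


heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
diet_a = [10, 6, 16, 17, 13, 12, 8, 14, 5, 9]
diet_b = [7, 13, 22, 15, 12, 14, 18, 8, 21, 23]

print(t_one_sample(heights, 66))
print(t_pooled(diet_a, diet_b))
```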
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Example 1 

The heights, in inches, of ten individuals chosen at random from a normal population are: 
63, 63, 66, 67, 68, 69, 70, 70, 71, 71 
The user tests the hypothesis that the mean height in this population is 66 inches. 

- R STATPAK t^ 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? PARH p 

PROGRAM NAME (TYPE "HELP" FOR AID)? TSTAT p 

DO YOU WANT INSTRUCTIONS.?YESp 

CODE 1 TEST MEAN=SPECIFIED VALUE 

CODE 2 TEST MEAN A=MEAN B POPULATION VARIANCES EQUAL 

CODE 3 TEST MEAN A=MEAN B POPULATION VARIANCES UNEQUAL 

CODE 4 TEST MEAN A=MEAN B NO ASSUMPTION ON VARIANCES. 

ENTER CODE OF HYPOTHESIS TO BE TESTED.?1 

DO YOU WANT TO ENTER THE VALUES ON-LINE.?YES 

VARIABLE NAME.?HGT 

ENTER NO OF OBSERVATIONS.?10 

ENTER THE 10 VALUES NOT EXCEEDING 6 IN A ROW 
63 63 66 67 68 69 
70 70 71 71 

ENTER VALUE OF POPULATION MEAN.? 66 p The user tests the hypothesis that 

the population mean is 66 inches. 

MEAN OF VARIABLE HGT 67.80000 



THE COMPUTED T VALUE IS 1.890378 

WITH DEGREES OF FREEDOM 9 



MODULE NAME (TYPE "HELP" FOR AID)? 



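From t-test tables, the two-tailed 5% value of t with 9 degrees of freedom is 2.262. The 
computed value, 1.890, is smaller, so the user cannot reject the hypothesis at a 5% level of 
significance. 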
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Example 2 

A test population was fed on diet A during a certain period. A random sample of ten members 
of this population shows the following increase in weight: 

10, 6, 16, 17, 13, 12, 8, 14, 5, and 9 

Another sample fed on diet B shows the following increase in weight for the same period: 
7, 13, 22, 15, 12, 14, 18, 8, 21, and 23 

The user tests whether diets A and B significantly affect the increase in weight of the popu- 
lations, assuming the population variances are equal. 

The data for diet A and diet B is stored on file TDATA. 
- TYPE TDATA ? 



10 2 

DIETA 

10 6 16 17 13 12 

8 14 5 9 

DIETB 

7 13 22 15 12 14 

18 8 21 23 



- R STATPAK d 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? PARH p 

PROGRAM NAME (TYPE "HELP" FOR AID)? TSTAT o 

DO YOU WANT INSTRUCTIONS.?NO 

ENTER CODE OF HYPOTHESIS TO BE TESTED.?2          The user wishes to test that the population means are equal, assuming equal variances. 

DO YOU WANT TO ENTER THE VALUES ON-LINE.?NO 

ENTER NAME OF DATA FILE.?TDATA 

VARIABLE NAME.?DIETA 

ENTER NAME OF DATA FILE.?TDATA 

VARIABLE NAME.?DIETB 
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MEAN OF VARIABLE DI ETA 11.00000 

MEAN OF VARIABLE DIETB 15.30000 



THE COMPUTED T VALUE IS 1.957919 

WITH DEGREES OF FREEDOM 18 



MODULE NAME (TYPE "HELP" FOR AID)? 



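From t-test tables, the two-tailed 5% value of t with 18 degrees of freedom is 2.101. The 
computed value, 1.958, is smaller, so the user cannot reject the hypothesis that the two 
samples come from populations with the same mean at a 5% level of significance. 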



The FSTAT Program 

FSTAT uses random samples from two normal populations to test the equality of the vari- 
ances of those populations. The program computes the F-ratio and the corresponding degrees 
of freedom. 
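
The F-ratio can be sketched in Python as the ratio of the two sample variances. This is a minimal illustration with a hypothetical function name; placing the larger variance in the numerator is an assumption, chosen because it agrees with the output of the example that follows.

```python
def variance(v):
    """Unbiased sample variance (divisor n - 1)."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def f_ratio(a, b):
    """Return (F, numerator d.f., denominator d.f.), larger variance on top."""
    va, vb = variance(a), variance(b)
    if va >= vb:
        return va / vb, len(a) - 1, len(b) - 1
    return vb / va, len(b) - 1, len(a) - 1


samp1 = [20, 16, 26, 27, 23, 22, 18, 24, 25, 19]
samp2 = [27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37]
print(variance(samp1), variance(samp2), f_ratio(samp1, samp2))
```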

Example 

Two random samples taken from two normal populations are shown below: 



Sample 1     Sample 2 

   20           27 
   16           33 
   26           42 
   27           35 
   23           32 
   22           34 
   18           38 
   24           28 
   25           41 
   19           43 
                30 
                37 
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The user calls FSTAT to obtain estimates of the variances of the populations and test the 
hypothesis that the two populations have the same variance. 
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- R STATPAKp 

TYMSHARE PDP-10 STATPAK VERSION 3.01 C 11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? PARH : p 

PROGRAM NAME (TYPE "HELP" FOR AID)? FSTAT^ 

DO YOU WANT TO ENTER THE VALUES ON-LINE. ?YES-p 

VARIABLE NAME.? SAMP l p 

ENTER NO OF OBSERVATIONS. ?10p 

ENTER THE 10 VALUES NOT EXCEEDING 6 IN A ROW 
20 16 26 27 23 22 ^ 
18 24 25 19^ 

VARIABLE NAME.? SAMP2 -p 

ENTER NO OF OBSERVATIONS.? 12;p 

ENTER THE 12 VALUES NOT EXCEEDING 6 IN A ROW 
27 33 42 35 32 34 ^ 
38 28 41 43 30 37 p 

VARIANCE OF VARIABLE SAMP1 13.33333 

VARIANCE OF VARIABLE SAMP2 28.54545 



THE COMPUTED F VALUE IS 2.140909 

WITH DEGREES OF FREEDOM 11 AND 9 



MODULE NAME (TYPE "HELP" FOR AID)? 



The calculated F-ratio is 2.1409, and the 5% value of F with 11 and 9 degrees of freedom is 
3.1, according to the F-test tables. Since the calculated ratio is smaller, the user has no 
evidence at a 5% level of significance that the population variances differ. 
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SECTION 7 

NON-PARAMETRIC HYPOTHESIS TESTING: 
THE NPARH MODULE 



This module consists of programs UTEST, CRANK, and CCORD. These programs accept data 
from the terminal or from one or more data files written in the standard format for STATPAK. 
All three programs accept a maximum of 100 observations per variable. 



The UTEST Program 

UTEST is used to determine whether two independent samples come from the same population. 
UTEST prints the Mann-Whitney U-statistic and a significance measure of this statistic. 

NOTE: When two populations cannot be assumed normal and homogeneous, UTEST can be 
used as a non-parametric test equivalent to the parametric t-test. 
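
The U-statistic and its significance measure can be sketched in Python. This is an illustration under stated assumptions, not UTEST's code: the function names are hypothetical, tied values receive average ranks, U is computed as n1*n2 + n1(n1 + 1)/2 - R1 (R1 being the rank sum of the first sample), and the significance measure is the normal approximation with the usual correction for ties. These choices agree with the figures printed in the example that follows.

```python
import math


def average_ranks(values):
    """Map each distinct value to its average 1-based rank in the pooled sample."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j
    return ranks


def mann_whitney(a, b):
    """Return (U, z) for two independent samples."""
    n1, n2 = len(a), len(b)
    pooled = a + b
    ranks = average_ranks(pooled)
    r1 = sum(ranks[v] for v in a)
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    n = n1 + n2
    ties = sum((t ** 3 - t) / 12 for t in (pooled.count(v) for v in set(pooled)))
    sigma = math.sqrt(n1 * n2 / (n * (n - 1)) * ((n ** 3 - n) / 12 - ties))
    return u, (u - n1 * n2 / 2) / sigma


abcco = [21.2, 23.4, 25.6, 24.8, 27.5, 26.2, 29.3, 29.4,
         28.7, 26.9, 27.8, 28.6, 29.2, 29.3, 28.7, 28.8]
xyzco = [19.3, 22.5, 26.2, 24.7, 26.2, 27.5, 28.7, 27.8, 29.2, 27.8,
         26.4, 25.7, 26.8, 28.7, 29.3, 28.6, 29.1, 29.2, 28.9, 29.2]
print(mann_whitney(abcco, xyzco))
```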

Example 

The figures below represent half-yearly sales in thousands of dollars for ABC Company and 
XYZ Company. 



ABC Company     XYZ Company 

   21.2            19.3 
   23.4            22.5 
   25.6            26.2 
   24.8            24.7 
   27.5            26.2 
   26.2            27.5 
   29.3            28.7 
   29.4            27.8 
   28.7            29.2 
   26.9            27.8 
   27.8            26.4 
   28.6            25.7 
   29.2            26.8 
   29.3            28.7 
   28.7            29.3 
   28.8            28.6 
                   29.1 
                   29.2 
                   28.9 
                   29.2 
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The user tests the hypothesis that the sales of the two companies are not significantly different. 

- R STATPAK p 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? NPARH 
PROGRAM NAME (TYPE "HELP" FOR AID)? UTEST 
DO YOU WANT TO ENTER DATA ON-LINE.?YES 
ENTER SIZE OF SAMPLE 1 AND SAMPLE 2.?16 20 
ENTER NAME OF VARIABLE.?ABCCO 

ENTER THE 16 VALUES OF ABCCO NOT EXCEEDING 6 IN A ROW. 

21.2 23.4 25.6 24.8 27.5 26.2 
29.3 29.4 28.7 26.9 27.8 28.6 
29.2 29.3 28.7 28.8 

ENTER NAME OF VARIABLE.?XYZCO 

ENTER THE 20 VALUES OF XYZCO NOT EXCEEDING 6 IN A ROW. 

19.3 22.5 26.2 24.7 26.2 27.5 
28.7 27.8 29.2 27.8 26.4 25.7 
26.8 28.7 29.3 28.6 29.1 29.2 
28.9 29.2 

MANN- WHITNEY U-STATISTIC IS 151.5000 

MEASURE OF SIGNIFICANCE ON U IS -0.2711977 



MODULE NAME (TYPE "HELP" FOR AID)? 



The measure of significance of the U-statistic is -0.2712. This measure is a standard normal 
deviate. The two-tailed 5% value from a standard normal deviate table is 1.96. Since the 
absolute value of the measure is smaller, there is no evidence that the two samples come 
from different populations. 
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The CRANK Program 

CRANK computes Kendall's rank correlation coefficient to determine the degree of associa- 
tion between two variables. The program also prints a significance measure of the coefficient 
if there are at least ten observations per variable. 

NOTE: CRANK is useful in analyzing correlations when the populations are not normally 
distributed. 

The data for the variables may be ranked or not ranked. The user indicates this by entering 
a code of 0 for unranked data and 1 for ranked data. 
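
Kendall's coefficient and a significance measure can be sketched in Python for data without ties. This is an illustration only: the function name and the five-point data set are hypothetical, and the manual does not describe how CRANK treats tied observations, so the sketch applies no tie correction and may not reproduce CRANK's figures on tied data. The significance measure divides the coefficient by the standard deviation sqrt(2(2n + 5) / (9n(n - 1))); for n = 16 this equals the ratio of the two figures printed in the example that follows.

```python
import math


def kendall_tau(x, y):
    """Return (tau, z) for paired observations, assuming no tied values."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            s += (product > 0) - (product < 0)   # +1 concordant, -1 discordant
    tau = s / (n * (n - 1) / 2)
    z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return tau, z


# Hypothetical tie-free data: 8 concordant and 2 discordant pairs.
print(kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4]))
```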

Example 

The data below corresponds to the half-yearly sales of Acme Company, which manufactures 
razors, and Gilt Company, which manufactures blades. The data is stored on file CDATA. 



- TYPE CDATA ? 



16 2 

ACME 

261 264 260 265 279 285 

282 293 302 296 291 302 

314 311 324 322 

GILT 

31 42 45 54 54 64 

61 54 61 72 84 85 

101 114 107 127 



The user calls CRANK to determine whether there is a significant relation between the sales 
of the two companies. 



-R_STATPAKp 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? NPARH ? 
PROGRAM NAME (TYPE "HELP" FOR AID)? CRANK} 
DO YOU WANT TO ENTER DATA ON-LINE. ?N0} 
ENTER NAME OF DATA FILE.? CDATA ^ 
ENTER NAME OF VARIABLE. ?ACME_p 
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ENTER NAME OF DATA FILE.? CDATA ^ 

ENTER NAME OF VARIABLE.? GILT;) 

ENTER VALUE OF CODE. 

CODE = 1 IF DATA IS RANKED. 

CODE = 0 IF DATA IS NOT RANKED. 

KENDALLS RANK CORRELATION COEFF. 0.7830425 
MEASURE OF SIGNIFICANCE IS 4.230545 



MODULE NAME (TYPE "HELP" FOR AID)? 



The measure of significance computed is 4.2305. This statistic is a standard normal deviate. 
The user compares the value with the 5% level of a standard normal deviate, which is 1.96. 
Thus, there is a significant correlation between the variables. 



The CCORD Program 

CCORD computes the degree of association among several variables based on a set of obser- 
vations. The observations may represent rankings of a particular set of data by the variables, 
or the data may be unranked. 

CCORD computes and prints the concordance coefficient, the chi-square statistic to measure 
the significance of the concordance coefficient, and the degrees of freedom. 

NOTE: The chi-square statistic is not computed if there are fewer than eight observations 
per variable. 
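
The concordance coefficient and its chi-square statistic can be sketched in Python for data that are already ranked and free of ties. This is an illustration with a hypothetical function name, using the standard formulas W = 12S / (m^2 (n^3 - n)) and chi-square = m(n - 1)W with n - 1 degrees of freedom, where m is the number of variables, n the number of observations, and S the sum of squared deviations of the rank totals from their mean. The rankings are those of the example that follows.

```python
def concordance(rankings):
    """Return (W, chi-square, degrees of freedom) for m complete rankings of n items."""
    m = len(rankings)
    n = len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    w = 12 * s / (m * m * (n ** 3 - n))
    return w, m * (n - 1) * w, n - 1


adams = [2, 3, 9, 4, 6, 1, 7, 5, 8]
evans = [9, 4, 1, 3, 5, 7, 8, 6, 2]
jones = [5, 4, 3, 2, 9, 7, 8, 6, 1]
lane = [6, 2, 1, 8, 4, 5, 7, 9, 3]
print(concordance([adams, evans, jones, lane]))
```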

Example 

Four financial analysts, Adams, Evans, Jones, and Lane, ranked nine different investments as 
follows: 



Investment:    1   2   3   4   5   6   7   8   9 

Adams:         2   3   9   4   6   1   7   5   8 
Evans:         9   4   1   3   5   7   8   6   2 
Jones:         5   4   3   2   9   7   8   6   1 
Lane:          6   2   1   8   4   5   7   9   3 
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This data is stored on file INVST, displayed below. 

- TYPE INVST -p 

9 4 
ADAMS 
2 3 9 4 6 1 
7 5 8 
EVANS 
9 4 1 3 5 7 
8 6 2 
JONES 
5 4 3 2 9 7 
8 6 1 
LANE 
6 2 1 8 4 5 
7 9 3 

The user calls CCORD to learn whether there is a significant degree of association among the 
analysts. 

- R STATPAK t) 

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71) 



MODULE NAME (TYPE "HELP" FOR AID)? NPARH -p, 

PROGRAM NAME (TYPE "HELP" FOR AID)? CCORD p 

ENTER NO. OF OBSERVATIONS AND NO. OF VARIABLES.?9 4          There are nine observations and four variables. 

DO YOU WANT TO ENTER DATA ON-LINE.?NO 

IS THE ENTIRE DATA IN ONE FILE.?YES 

ENTER NAME OF DATA FILE.?INVST 

ENTER NAME OF VARIABLE.?ADAMS 

ENTER NAME OF VARIABLE.?EVANS 

ENTER NAME OF VARIABLE.?JONES 

ENTER NAME OF VARIABLE.?LANE 
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ENTER VALUE OF CODE. 

CODE = 1 FOR INPUT DATA RANKED 

CODE = 0 FOR INPUT DATA NOT RANKED. 

1          The user enters 1 to indicate that the data is ranked. 

CONCORDANCE COEFFICIENT IS 0.2979167 

COMPUTED CHI -SQUARE IS 9.533333 

WITH DEGREES OF FREEDOM 8 



MODULE NAME (TYPE "HELP" FOR AID)? 



The computed value of chi-square is 9.5333. The 5% level with eight degrees of freedom 
from the chi-square tables is 15.51. This is greater than the computed value, so the user con- 
cludes that there is no significant association between the analysts. 






SECTION 8 

TIME SERIES ANALYSIS: 
THE TIMSA MODULE 



The TIMSA module consists of the forecasting programs TRIXP and XPOSE. TRIXP is recom- 
mended for short-term forecasting, while XPOSE is recommended for long-term forecasting with 
seasonal variations. Both programs accept data from a file written in the standard STATPAK 
format. 



The TRIXP Program 

This program forecasts future values of a variable based on a set of past observations. The 
forecast is made as a function of time only, and seasonal factors are not used in the calculations. 

TRIXP accepts a maximum of 500 observations from a data file and can forecast for up to 
50 periods into the future. 

When TRIXP is called, it requests a smoothing coefficient, Q, and smoothing constants, A,
B, and C. These numbers are used to smooth the past data. Q is a number between 0 and 1,
inclusive. The role of Q in the formulas below demonstrates its effect. The user may enter
initial values for the smoothing constants A, B, and C, or he may set them to 0 and let TRIXP
generate the initial values.

TRIXP generates the initial values of the smoothing constants as follows: 



C = x_1 - 2x_2 + x_3

B = x_2 - x_1 - 1.5C

A = x_1 - B - 0.5C

where x_i (i = 1,2,3) are the first three observations of variable x.

The initial values of the smoothing constants are used to smooth the data for one period 
ahead. Then the constants A, B, and C are updated, smoothing is done for the next period, 
and so on. 

TRIXP updates the smoothing constants as follows:

A_updated = x_k + (1 - Q)^3 (S_k - x_k)

B_updated = B_previous + C_previous - 1.5 Q^2 (2 - Q) (S_k - x_k)

C_updated = C_previous - Q^3 (S_k - x_k)

where x_k is the kth observation of variable x.

S_k is the smoothed value for the kth observation (S_k = A + B + 0.5C, using the
previous values of A, B, and C).
Q is the smoothing coefficient entered by the user.
The process of updating the smoothing constants is repeated until the entire series of past
observations is exhausted. Then the final values for A, B, and C are used in forecasting.

The forecast for t periods into the future is computed as follows:

forecast = A + Bt + Ct^2/2
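
The whole procedure can be sketched in a few lines. This is an illustration, not the TRIXP source. It assumes the update rules B = B + C - 1.5 Q^2 (2 - Q)(S_k - x_k) and C = C - Q^3 (S_k - x_k), with S_k computed from the previous constants, and it takes the PRODB series from the example that follows. The function name trixp is hypothetical.

```python
def trixp(series, q, horizon):
    """Smooth `series` with coefficient q; return (A, B, C, forecasts)."""
    x1, x2, x3 = series[:3]
    # Initial constants from the first three observations.
    c = x1 - 2 * x2 + x3
    b = x2 - x1 - 1.5 * c
    a = x1 - b - 0.5 * c
    for x in series:
        s = a + b + 0.5 * c            # smoothed value S_k, one period ahead
        err = s - x                    # S_k - x_k
        # Tuple assignment: every right-hand side uses the previous A, B, C.
        a, b, c = (
            x + (1 - q) ** 3 * err,
            b + c - 1.5 * q ** 2 * (2 - q) * err,
            c - q ** 3 * err,
        )
    forecasts = [a + b * t + c * t ** 2 / 2 for t in range(1, horizon + 1)]
    return a, b, c, forecasts


prodb = [220, 222, 220, 230, 234, 248, 256, 260, 252,
         263, 263, 264, 266, 252, 254, 248, 249, 260]
a, b, c, forecasts = trixp(prodb, q=0.5, horizon=3)
print(a, b, c, forecasts)
```

With Q = 0.5 and the program-generated initial constants, this sketch arrives at the same A, B, C, and three forecasts that TRIXP prints in the example.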

Example 

File SALES, below, contains two sets of observations representing the last 18 months of 
sales of product A and product B. 



- TYPE SALES ⊃

18 2
PRODA
419 414 413 412 419 417
422 430 438 441 447 450
454 463 470 472 470 472
PRODB
220 222 220 230 234 248
256 260 252 263 263 264
266 252 254 248 249 260



The user calls TRIXP to obtain sales forecasts for product B for the next three months. 

- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? TIMSA ⊃

PROGRAM NAME (TYPE "HELP" FOR AID)? TRIXP ⊃

ENTER NAME OF DATA FILE.? SALES ⊃

ENTER NAME OF VARIABLE.? PRODB ⊃

ENTER THE VALUES OF THE SMOOTHING CONSTANTS A,B,C
IF YOU HAVE NO SPECIAL CHOICE ENTER 0.
0 ⊃        The user allows the program to calculate A, B, and C.

ENTER THE VALUE OF THE SMOOTHING COEFFICIENT.? .5 ⊃

DO YOU WANT THE SMOOTHED SERIES PRINTED OUT.? NO ⊃

FORECAST FOR VARIABLE PRODB

NO. OF PERIODS IN HISTORY    18
SMOOTHING COEFFICIENT        0.5000000

SMOOTHING CONSTANTS (UPDATED)
A = 258.1592
B = 5.391922
C = 1.657501

FOR HOW MANY PERIODS AHEAD YOU NEED FORECASTS.? 3 ⊃

FORECAST FOR PRODB

PERIODS AHEAD    FORECAST

      1          264.3799
      2          272.2581
      3          281.7937

MODULE NAME (TYPE "HELP" FOR AID)?



The XPOSE Program 

This program forecasts future values of a variable based either on a set of as many as 100 
past observations or on a set of forecasting parameters and smoothing coefficients supplied by 
the user. XPOSE uses the technique of exponentially weighted moving averages and incorporates 
linear trends and seasonal factors. 1 

When forecasting parameters are not supplied by the user, XPOSE generates them from past
observations. XPOSE uses a user-specified number, H, of the earliest observations to generate
starting values of the forecasting parameters. Using these parameters and a set of smoothing
coefficients A, B, and C, with values between 0 and 1, the program computes forecasts to be
compared with the remaining past observations.

The forecasts are made for one period ahead and compared with the observations for that 
period. Then XPOSE updates the parameters and makes a forecast for the next period. This 
process is repeated until the entire series of past observations is exhausted. The final values of 
the parameters are used to forecast future values of the variable. 

XPOSE uses a least squares method to compute the optimal smoothing coefficients. The 
forecasting technique is improved by minimizing the sum of squares of deviations, that is, 

Σ (observed value - forecasted value)^2

This sum is computed for all possible combinations of (A,B,C) in intervals of .1. The combi- 
nation (A,B,C) which minimizes the sum of squared deviations is termed the optimal set of 
smoothing coefficients. 
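
The search over (A,B,C) can be pictured as a plain grid search. The sketch below is only an illustration of that idea. It assumes the grid includes both end points 0 and 1, and it takes the sum of squared deviations as a caller-supplied function; grid_search and toy_sse are hypothetical names, and toy_sse merely stands in for running the forecast over the past series.

```python
from itertools import product


def grid_search(sse, step_count=10):
    """Return the (A, B, C) on the grid 0.0, 0.1, ..., 1.0 minimizing sse."""
    grid = [i / step_count for i in range(step_count + 1)]
    # 11 values per coefficient give 11**3 = 1331 combinations to try.
    return min(product(grid, repeat=3), key=lambda abc: sse(*abc))


def toy_sse(a, b, c):
    """Hypothetical stand-in for the sum of squared forecast deviations."""
    return (a - 0.2) ** 2 + (b - 0.8) ** 2 + (c - 0.2) ** 2


best = grid_search(toy_sse)
print(best)
```

In practice the function passed in would replay the one-period-ahead forecasts for the given coefficients and return the sum of squared deviations.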



1 - This technique is documented in "Forecasting Sales by Exponentially Weighted Moving Averages" by Peter R. Winters, in Management 
Science, April 1960. 
The length of a time period is defined by the user. He enters 4 to indicate quarterly periods,
12 to indicate monthly periods, etc.

The value of H, defined above, is limited as follows: H cannot exceed 60; it must be greater 
than the number of periods in a year, less than the number of periods in history, and must be a 
multiple of the number of periods in a year. Thus, if the user specifies that there are 12 periods 
in a year, H must be 24, 36, 48, or 60. 
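
The limits on H can be expressed as a short check. This is a sketch of the rules stated above, not code from XPOSE; the function name valid_h_values and its parameter names are illustrative.

```python
def valid_h_values(periods_per_year, periods_in_history, h_max=60):
    """List every H allowed by the rules: a multiple of the number of
    periods in a year, greater than that number, less than the number of
    periods in history, and not exceeding 60."""
    return [
        h
        for h in range(periods_per_year, h_max + 1, periods_per_year)
        if periods_per_year < h < periods_in_history
    ]


print(valid_h_values(12, 100))  # monthly data, long history
print(valid_h_values(12, 36))   # monthly data, 36 periods of history
print(valid_h_values(4, 20))    # quarterly data, 20 periods of history
```

With 12 periods in a year and a long history, the function returns 24, 36, 48, and 60; with only 36 periods of history, 24 is the single permitted value.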

XPOSE computes and prints the following: 

• Past observations with trend and seasonal factors, deseasonalized data, and forecasts 
(optional). 

• Forecasting parameters, including:

SO        the most recent smoothed and seasonally adjusted average.

R         the most recent estimate of the trend factor, that is, the rate of increase or
          decrease.

A,B,C     the smoothing constants, with values between 0 and 1.

F(t)      seasonal factors for season t.

• Forecasts of future values. 

• The standard error of the forecasts. 

Example 1 

File HIST, below, contains the sales history of two products for the last 36 months. 



- TYPE HIST ⊃

36 2
PRODA
343.9 366.3 355.8 395.4 398.7 531.4
271.3 256.7 301.1 377.8 354.8 376.1
349.6 380.1 364.7 398.3 399.7 546.6
276.2 253.3 334.5 337.6 374.6 390.2
362.0 395.3 392.9 401 429.8 561.4
303.5 270.2 349.2 380.9 419 409.9
PRODB
101 102 103.5 105.4 104 104.5
106 106 107.3 106.2 103 108.2
109 109.2 109.3 109.7 107.6 108.7
109.2 105.6 107 107.2 107.1 108
102.9 104.5 106.2 107 106.5 108
108 108.2 106.5 106.3 107.8 109
The user calls XPOSE to forecast the next year's sales for product A, variable PRODA.

- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? TIMSA ⊃

PROGRAM NAME (TYPE "HELP" FOR AID)? XPOSE ⊃

DO YOU HAVE THE PARAMETERS A,B,C,SO,R,F(J)
READY TO BE USED IN FORECASTING OR DO YOU WANT TO COMPUTE
THEM ? TYPE YES IF YOU HAVE THE PARAMETERS READY.
NO ⊃

ENTER THE NAME OF THE FILE.? HIST ⊃

ENTER NAME OF VARIABLE.? PRODA ⊃

ENTER THE NAME OF YOUR COMPANY OR DIVISION
NAME SHOULD NOT EXCEED 15 CHARACTERS.
ACE PRODUCTS ⊃

NO. OF PERIODS IN HISTORY IS 36
ENTER THE VALUES OF H AND L.
H = NO OF PERIODS TO BE USED FOR GENERATING INITIAL
    VALUES OF TREND, AVERAGE AND SEASONALS
L = SEASONS WITHIN A YEAR
    EXAMPLE- 12 FOR MONTHLY DATA, 4 FOR QUARTERLY DATA
24 12 ⊃        The first 24 months of data generate initial values for the forecasting parameters.

DO YOU WANT A PRINT OUT OF THE PAST SERIES
WITH SEASONALS, TREND, DESEASONALISED DATA AND
FORECASTS?. TYPE 1 FOR YES 0 FOR NO
1 ⊃
PERIOD  SEASON  AVERAGE     TREND     SEASONAL  ACTUAL      FOREC BASED
                                                            ON PREV PER
  T       J       SO          R        F(J)      S(T)       S(T-1,1)

   1      1     360.6679    0.4014    0.9549    343.9000    0.0000000
   2      2     359.8491    0.1573    1.0207    366.3000    0.0000000
   3      3     359.5432    0.0647    0.9906    355.8000    0.0000000
   4      4     359.9543    0.1340    1.0976    395.4000    0.0000000
   5      5     360.6169    0.2397    1.1043    398.7000    0.0000000
   6      6     360.4105    0.1505    1.4759    531.4000    0.0000000
   7      7     360.6503    0.1683    0.7521    271.3000    0.0000000
   8      8     362.0906    0.4227    0.7070    256.7000    0.0000000
   9      9     359.2623   -0.2275    0.8444    301.1000    0.0000000
  10     10     364.4666    0.8589    1.0249    377.8000    0.0000000
  11     11     363.5589    0.5056    0.9798    354.8000    0.0000000
  12     12     363.2893    0.3505    1.0370    376.1000    0.0000000
  13      1     364.1371    0.4500    0.9590    349.6000    0.0000000
  14      2     366.1461    0.7618    1.0346    380.1000    0.0000000
  15      3     367.1574    0.8117    0.9928    364.7000    0.0000000
  16      4     366.9498    0.6078    1.0879    398.3000    0.0000000
  17      5     366.4347    0.3833    1.0935    399.7000    0.0000000
  18      6     367.5246    0.5246    1.4850    546.6000    0.0000000
  19      7     367.8867    0.4921    0.7510    276.2000    0.0000000
  20      8     366.3605    0.0884    0.6945    253.3000    0.0000000
  21      9     372.3869    1.2760    0.8875    334.5000    0.0000000
  22     10     364.8087   -0.4948    0.9453    337.6000    0.0000000
  23     11     367.9175    0.2259    1.0105    374.6000    0.0000000
  24     12     369.7669    0.5506    1.0516    390.2000    0.0000000
  25      1     371.7466    0.8364    0.9708    362.0000    355.1474
  26      2     374.4800    1.2158    1.0514    395.3000    385.4868
  27      3     379.7090    2.0185    1.0263    392.9000    372.9789
  28      4     379.1038    1.4937    1.0638    401.0000    415.2713
  29      5     383.0888    1.9920    1.1162    429.8000    416.1788
  30      6     383.6752    1.7109    1.4676    561.4000    571.8362
  31      7     389.1301    2.4597    0.7742    303.5000    289.4405
  32      8     391.0819    2.3581    0.6916    270.2000    271.9635
  33      9     393.4461    2.3593    0.8875    349.2000    349.1732
  34     10     397.2310    2.6444    0.9562    380.9000    374.1617
  35     11     402.8308    3.2355    1.0342    419.0000    404.0685
  36     12     402.8092    2.5841    1.0244    409.9000    427.0260



DO YOU WANT A PRINT OUT OF THE PARAMETERS
CALCULATED FOR YOUR DATA? TYPE YES OR NO
YES ⊃

SO =  402.8092
R  =  2.584084
A  =  0.2000000
B  =  0.8000000
C  =  0.2000000

SEASONAL FACTORS

F( 1) =  0.9708323
F( 2) =  1.051404
F( 3) =  1.026345
F( 4) =  1.063781
F( 5) =  1.116244
F( 6) =  1.467569
F( 7) =  0.7741639
F( 8) =  0.6916253
F( 9) =  0.8875313
F(10) =  0.9561737
F(11) =  1.034208
F(12) =  1.024406

FOR HOW MANY PERIODS AHEAD YOU NEED FORECASTS?
12 ⊃

NEED FORECASTS ON TTY,DSK OR BOTH? BOTH ⊃
OUTPUT FILENAME? FOREC ⊃

SALES FORECASTS ACE PRODUCTS

PERIODS AHEAD    FORECAST

      1          393.5689
      2          428.9492
      3          421.3779
      4          439.4964
      5          464.0557
      6          613.9041
      7          325.8439
      8          292.8908
      9          378.1469
     10          409.8639
     11          445.9859
     12          444.4060

STANDARD ERROR OF FORECAST = 12.79439

ENTER NAME OF VARIABLE.? STOP ⊃

MODULE NAME (TYPE "HELP" FOR AID)?
Example 2

After running the previous example, the user obtains an actual sales figure for product A for
the 37th month. He now wishes to update the previous forecast by incorporating this figure
(383.1). He enters the parameters computed in Example 1.



- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.02 31-JAN-72

MODULE NAME (TYPE "HELP" FOR AID)? TIMSA ⊃

PROGRAM NAME (TYPE "HELP" FOR AID)? XPOSE ⊃

DO YOU HAVE THE PARAMETERS A,B,C,SO,R,F(J)
READY TO BE USED IN FORECASTING OR DO YOU WANT TO COMPUTE
THEM ? TYPE YES IF YOU HAVE THE PARAMETERS READY.
YES ⊃

ENTER THE NAME OF YOUR COMPANY OR DIVISION
NAME SHOULD NOT EXCEED 15 CHARACTERS.
ACE PRODUCTS ⊃

ENTER NAME OF VARIABLE.? PRODA ⊃

HOW MANY SEASONS YOU HAVE IN A YEAR?
12 ⊃

ENTER THE VALUES OF A,B,C,SO AND R
.2 .8 .2 402.8092 2.584084 ⊃

ENTER THE SEASONAL FACTORS, 5 IN A ROW        The user enters the values computed in Example 1.
.9708323 1.051404 1.026345 1.063781 1.116244 ⊃
1.467569 .7741639 .6916253 .8875313 .9561737 ⊃
1.034208 1.024406 ⊃

WHICH SEASON ARE YOU IN RIGHT NOW?.
1 ⊃        This is the first season of the forecast in Example 1.

ENTER THE ACTUAL SALES FOR THIS SEASON
383.1 ⊃

DO YOU WANT A PRINT OUT OF THE PARAMETERS
CALCULATED FOR YOUR DATA? TYPE YES OR NO
YES ⊃

SO =  403.2366
R  =  2.152748
A  =  0.2000000
B  =  0.8000000
C  =  0.2000000

SEASONAL FACTORS

F( 1) =  0.9542165
F( 2) =  1.051404
F( 3) =  1.026345
F( 4) =  1.063781
F( 5) =  1.116244
F( 6) =  1.467569
F( 7) =  0.7741639
F( 8) =  0.6916253
F( 9) =  0.8875313
F(10) =  0.9561737
F(11) =  1.034208
F(12) =  1.024406

FOR HOW MANY PERIODS AHEAD YOU NEED FORECASTS?
12 ⊃

NEED FORECASTS ON TTY,DSK OR BOTH? BOTH ⊃
OUTPUT FILENAME ? NEW4 ⊃

SALES FORECASTS ACE PRODUCTS

PERIODS AHEAD    FORECAST

      1          426.2280        Notice that the additional data
      2          418.2788        has caused a slight change in
      3          435.8256        the forecasts from Example 1.
      4          459.7224
      5          607.5741
      6          322.1707
      7          289.3109
      8          373.1702
      9          404.0898
     10          439.2944
     11          437.3362
     12          409.4253

DO YOU HAVE THE PARAMETERS A,B,C,SO,R,F(J)
READY TO BE USED IN FORECASTING OR DO YOU WANT TO COMPUTE
THEM ? TYPE YES IF YOU HAVE THE PARAMETERS READY.
STOP ⊃

MODULE NAME (TYPE "HELP" FOR AID)?
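
The update performed in Example 2 can be followed step by step. The manual does not print XPOSE's update equations, so the sketch below assumes the standard updates from the Winters paper cited earlier: A smooths the average, B the seasonal factor, and C the trend. The function names are illustrative. Under that assumption the sketch arrives at the SO, R, and F(1) values printed in Example 2.

```python
def winters_update(actual, season, s0, r, f, a, b, c):
    """Fold one new observation into (SO, R, F); `season` is 1-based."""
    f = list(f)
    old = f[season - 1]
    s_new = a * actual / old + (1 - a) * (s0 + r)        # smoothed average
    f[season - 1] = b * actual / s_new + (1 - b) * old   # seasonal factor
    r_new = c * (s_new - s0) + (1 - c) * r               # trend
    return s_new, r_new, f


def winters_forecast(s0, r, f, last_season, periods):
    """Forecast `periods` ahead, starting after `last_season` (1-based)."""
    n = len(f)
    return [(s0 + t * r) * f[(last_season + t - 1) % n]
            for t in range(1, periods + 1)]


# Parameters computed in Example 1, then the 37th month (season 1) = 383.1.
factors = [0.9708323, 1.051404, 1.026345, 1.063781, 1.116244, 1.467569,
           0.7741639, 0.6916253, 0.8875313, 0.9561737, 1.034208, 1.024406]
s0, r, factors2 = winters_update(383.1, 1, 402.8092, 2.584084, factors,
                                 a=0.2, b=0.8, c=0.2)
forecasts = winters_forecast(s0, r, factors2, last_season=1, periods=12)
print(s0, r, factors2[0], forecasts[0], forecasts[-1])
```

Each forecast is the trended average (SO + tR) multiplied by the seasonal factor of the season being forecast, which is why only F(1) changes after a season-1 observation.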
SECTION 9

DISCRIMINANT ANALYSIS: 
THE DISCA MODULE 



The DISCA module consists of one program, DANSS, which performs a discriminant analysis 
and generates linear discriminant functions. The program accepts data from a file written in 
the standard format for STATPAK. 

DANSS tests a set of observations to find whether the populations represented in the data 
vary significantly from each other. The data used by DANSS is grouped by samples from 
different populations. 

The program performs a discriminant analysis of as many as ten groups with a maximum of 
ten variables each. The total number of observations in the analysis must not exceed 3000. 
The number of variables must be greater than the number of groups in the analysis. 

DANSS computes and prints the following statistics: 

• The means of the variables in each group. 

• A pooled dispersion matrix (optional). 

• The common means of each variable. These are the means of the variables considering all 
observations from all groups. 

• The Mahalanobis D-square statistic. This statistic is used to determine whether the means
of the variables differ significantly in each of the groups. Assuming normal populations,
the user can compare the D-square statistic to the chi-square value with m(k-1) degrees
of freedom, where m is the number of variables and k is the number of groups.

• The coefficients in each discriminant function. 

• Evaluation of each observation on the basis of the discriminant functions developed 
(optional). 
Example

The data below corresponds to four groups of office buildings. The first group consists of 
eight buildings, the second and third consist of seven each, and the fourth group has eight 
buildings. Measurements on five different characteristics have been made on each of these 
buildings. 

           Building   BATHS    HGT    LOC   LSCP   OFSPC

Group 1        1         8     100     10      3     24
               2         8      80     12      4     22
               3         8      75      3      9      9
               4         2      17      2     16      7
               5         8      90     10      5     23
               6         8      95      3     17      6
               7         8      85     10      2     29
               8         8     105     10      7     28

Group 2        9         8     130     10      9     28
              10         9     115      7     11      8
              11         8     120     10      8     27
              12        14     275      6      1     14
              13         6      50      8      7     18
              14         2      20      9      7     19
              15         8      90     10      7     27

Group 3       16        15     250     11      3     20
              17         7      90      4      9      9
              18         7      75     13      4     21
              19        16     300      5      8     16
              20         5     100      9      6     23
              21         8     115     10      8     27
              22         7     100      3     17      6

Group 4       23         8     115     10      3     23
              24         8     100     12      4     23
              25         8     100      3      9     21
              26         2      35      2     15      7
              27         8      90     10      9     27
              28         8      95      9      8     26
              29         9     130      8      7     18
              30         8     140     10      7     26



These observations are stored on file BLDG, below. 

- TYPE BLDG ⊃

30 5
BATHS
8 8 8 2 8 8
8 8 8 9 8 14
6 2 8 15 7 7
16 5 8 7 8 8
8 2 8 8 9 8
HGT
100 80 75 17 90 95
85 105 130 115 120 275
50 20 90 250 90 75
300 100 115 100 115 100
100 35 90 95 130 140
LOC
10 12 3 2 10 3
10 10 10 7 10 6
8 9 10 11 4 13
5 9 10 3 10 12
3 2 10 9 8 10
LSCP
3 4 9 16 5 17
2 7 9 11 8 1
7 7 7 3 9 4
8 6 8 17 3 4
9 15 9 8 7 7
OFSPC
24 22 9 7 23 6
29 28 28 8 27 14
18 19 27 20 9 21
16 23 27 6 23 23
21 7 27 26 18 26
The DANSS program tests whether the groups vary significantly from each other.

- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? DISCA ⊃

PROGRAM NAME (TYPE "HELP" FOR AID)? DANSS ⊃

ENTER NAME OF DATA FILE.? BLDG ⊃

ENTER NO OF VARIABLES IN ANALYSIS.? 5 ⊃

ENTER NO OF GROUPS.? 4 ⊃

NUMBER IN GROUP 1? 8 ⊃

NUMBER IN GROUP 2? 7 ⊃

NUMBER IN GROUP 3? 7 ⊃

NUMBER IN GROUP 4? 8 ⊃

ENTER NAME OF VARIABLE 1.? BATHS ⊃

ENTER NAME OF VARIABLE 2.? HGT ⊃

ENTER NAME OF VARIABLE 3.? LOC ⊃

ENTER NAME OF VARIABLE 4.? LSCP ⊃

ENTER NAME OF VARIABLE 5.? OFSPC ⊃

PRINT POOLED DISPERSION MATRIX.? YES ⊃

EVALUATE OBSERVATIONS ON THE BASIS OF
DISCRIMINANT FUNCTIONS DEVELOPED.? YES ⊃
GROUP 1 MEANS
BATHS     7.250000
HGT       80.87500
LOC       7.500000
LSCP      7.875000
OFSPC     18.50000

GROUP 2 MEANS
BATHS     7.857143
HGT       114.2857
LOC       8.571429
LSCP      7.142857
OFSPC     20.14286

GROUP 3 MEANS
BATHS     9.285714
HGT       147.1429
LOC       7.857143
LSCP      7.857143
OFSPC     17.42857

GROUP 4 MEANS
BATHS     7.375000
HGT       100.6250
LOC       8.000000
LSCP      7.750000
OFSPC     21.37500

POOLED DISPERSION MATRIX

ROW BATHS
 9.833104     181.7837     1.917582    -6.098901     4.621566
ROW HGT
 181.7837     3848.040     13.32692    -104.0007     56.80426
ROW LOC
 1.917582     13.32692     11.94505    -11.16209     22.60989
ROW LSCP
-6.098901    -104.0007    -11.16209     19.61882    -22.74863
ROW OFSPC
 4.621566     56.80426     22.60989    -22.74863     62.78640

COMMON MEANS
BATHS     7.900000
HGT       109.4000
LOC       7.966667
LSCP      7.666667
OFSPC     19.40000

GENERALIZED MAHALANOBIS D-SQUARE 15.67007



NEED DISCRIMINANT FUNCTION COEFFICIENTS?.
TYPE TTY,DSK,BOTH OR NONE.
TTY ⊃        The coefficients are printed only on the terminal.

DISCRIMINANT FUNCTION 1
CONSTANT           -27.26358
COEFF. OF BATHS     2.588296
COEFF. OF HGT      -0.4746080E-01
COEFF. OF LOC       1.851615
COEFF. OF LSCP      2.423395
COEFF. OF OFSPC     0.3583278

DISCRIMINANT FUNCTION 2
CONSTANT           -28.85336
COEFF. OF BATHS     1.596163
COEFF. OF HGT       0.1119795E-01
COEFF. OF LOC       2.262848
COEFF. OF LSCP      2.562995
COEFF. OF OFSPC     0.3069434

DISCRIMINANT FUNCTION 3
CONSTANT           -31.16577
COEFF. OF BATHS     1.531905
COEFF. OF HGT       0.2630825E-01
COEFF. OF LOC       2.451687
COEFF. OF LSCP      2.674445
COEFF. OF OFSPC     0.2271499

DISCRIMINANT FUNCTION 4
CONSTANT           -28.79324
COEFF. OF BATHS     1.851321
COEFF. OF HGT      -0.5632087E-02
COEFF. OF LOC       1.933716
COEFF. OF LSCP      2.545502
COEFF. OF OFSPC     0.4351975
EVALUATION OF CLASSIFICATION FUNCTIONS FOR EACH OBSERVATION

GROUP 1
               PROBABILITY ASSOCIATED WITH        LARGEST
OBSERVATION    LARGEST DISCRIMINANT FUNCTION      FUNCTION NO.
     1                 0.3623551                       1
     2                 0.4260625                       1
     3                 0.8548857                       1
     4                 0.3428500                       4
     5                 0.4170793                       1
     6                 0.3367720                       1
     7                 0.5585431                       1
     8                 0.4311859                       4

GROUP 2
               PROBABILITY ASSOCIATED WITH        LARGEST
OBSERVATION    LARGEST DISCRIMINANT FUNCTION      FUNCTION NO.
     1                 0.3520473                       4
     2                 0.4983430                       3
     3                 0.3664007                       4
     4                 0.6785006                       3
     5                 0.5690117                       1
     6                 0.4460260                       2
     7                 0.3777263                       4

GROUP 3
               PROBABILITY ASSOCIATED WITH        LARGEST
OBSERVATION    LARGEST DISCRIMINANT FUNCTION      FUNCTION NO.
     1                 0.7327295                       3
     2                 0.4551894                       1
     3                 0.3981236                       2
     4                 0.7815912                       3
     5                 0.4440284                       2
     6                 0.3814765                       4
     7                 0.4816400                       3

GROUP 4
               PROBABILITY ASSOCIATED WITH        LARGEST
OBSERVATION    LARGEST DISCRIMINANT FUNCTION      FUNCTION NO.
     1                 0.3498248                       2
     2                 0.3720276                       2
     3                 0.5306962                       1
     4                 0.3366389                       2
     5                 0.3955042                       4
     6                 0.4084113                       4
     7                 0.3006484                       2
     8                 0.3713639                       2

MODULE NAME (TYPE "HELP" FOR AID)?
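
The evaluation table can be related to the discriminant functions as follows. This sketch is not the DANSS code. It assumes each observation is scored by all four functions and that the printed probability is exp(largest score) divided by the sum of exp(score) over the functions. The function name classify is illustrative, and the coefficients are read from the printout above with their exponents applied.

```python
from math import exp

# (constant, BATHS, HGT, LOC, LSCP, OFSPC) for discriminant functions 1-4.
functions = [
    (-27.26358, 2.588296, -0.04746080, 1.851615, 2.423395, 0.3583278),
    (-28.85336, 1.596163, 0.01119795, 2.262848, 2.562995, 0.3069434),
    (-31.16577, 1.531905, 0.02630825, 2.451687, 2.674445, 0.2271499),
    (-28.79324, 1.851321, -0.005632087, 1.933716, 2.545502, 0.4351975),
]


def classify(observation):
    """Return (number of the largest function, probability associated with it)."""
    scores = [f[0] + sum(c * v for c, v in zip(f[1:], observation))
              for f in functions]
    top = max(scores)
    # Subtracting the largest score keeps exp() well scaled; its own weight is 1.
    weights = [exp(s - top) for s in scores]
    return scores.index(top) + 1, 1.0 / sum(weights)


# Building 1 of group 1: BATHS, HGT, LOC, LSCP, OFSPC.
group, probability = classify((8, 100, 10, 3, 24))
print(group, round(probability, 4))
```

Under this assumption, building 1 is assigned to function 1 with a probability close to the first entry in the GROUP 1 table, and building 4 (2, 17, 2, 16, 7) is assigned to function 4.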
SECTION 10

VARIANCE AND FACTOR ANALYSIS: 
THE AVANC MODULE 



The AVANC module performs an analysis of variance in the program ANVAR, and a factor 
analysis in the program FCTOR. FCTOR accepts data from a file written in the standard 
format for STATPAK. ANVAR accepts data from a file written in the format described below. 



The ANVAR Program 

ANVAR analyzes the variance of a set of observations and separates the total variance into 
components due to different factors. These components are compared by the F-test for any 
significant effect due to individual factors or interactions of factors. 

ANVAR accepts data from a file written in the following form: 



Line Number    Contents                          Format

1              Number of factors, between        Integer
               2 and 6, inclusive

2              Factor name, Factor level         Five alphanumeric
               (greater than or equal to 2)      characters, Integer

3              Factor name, Factor level         Alphanumeric, Integer

k              Factor name, Factor level         Alphanumeric, Integer

k+1            Up to six observations            Free form

k+2            Up to six observations            Free form

k+N            Up to six observations            Free form



NOTE: The total number of observations may not exceed 4000. 

The order of observations on the data file is such that the first factor varies most rapidly, 
then the second, and so on. 

For example, the data file, DATFL, below, contains the data for the two factors A and B. 
A is a four-level factor and B is a three-level factor. 



          A1    A2    A3    A4

B1        23    40    21    17

B2        38    19    25    28

B3        20    26    33    29

The form of DATFL is:

2 

A 4 

B 3 

23 40 21 17 38 19 

25 28 20 26 33 29 
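
The ordering rule (first factor varies most rapidly) can be sketched as a small routine that writes a table out in ANVAR's file form. This is an illustration only; the function name anvar_lines and the table layout are assumptions, with the two-factor table read as rows B1 to B3 and columns A1 to A4.

```python
from itertools import product


def anvar_lines(factors, value, per_line=6):
    """Build the lines of an ANVAR data file.

    factors: list of (name, levels), first factor first.
    value:   function taking a tuple of 0-based level indices, one per factor.
    """
    lines = [str(len(factors))]
    lines += [f"{name} {levels}" for name, levels in factors]
    # product() varies its LAST argument fastest, so the factors are fed in
    # reverse order and each index tuple is flipped back afterwards.
    ranges = [range(levels) for _, levels in reversed(factors)]
    obs = [value(tuple(reversed(idx))) for idx in product(*ranges)]
    for i in range(0, len(obs), per_line):
        lines.append(" ".join(str(v) for v in obs[i:i + per_line]))
    return lines


table = [  # table[b][a]: rows B1..B3, columns A1..A4
    [23, 40, 21, 17],
    [38, 19, 25, 28],
    [20, 26, 33, 29],
]
datfl = anvar_lines([("A", 4), ("B", 3)], lambda ab: table[ab[1]][ab[0]])
print("\n".join(datfl))
```

The printed lines match the form of DATFL shown above: all four levels of A for B1, then for B2, then for B3, six observations to a line.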

ANVAR computes and prints the following statistics: 

• The grand mean of all observations. 

• The sum of squares, degrees of freedom, and mean squares for each factor and combination 
of factors. 

• An analysis of variance table for each source of variation. The user may specify that one 
or more sources be pooled under the single source, ERROR. The F-ratio is computed 
between each explicit source and ERROR by the formula: 



F-ratio = (mean square due to the source) / (mean square due to ERROR)

Example 1



Ten varieties of wheat were grown on three plots of land each. The following yields, in
bushels, were obtained:

Variety:     1     2     3     4     5     6     7     8     9    10

Plot 1:      7     7    14    11     9     6     7     8    12     9

Plot 2:      8     9    13    10     9     7    13    13    11    11

Plot 3:      7     6    16    11    12     5    12    11    11    11



This data is stored on file PLOTS, below.

- TYPE PLOTS ⊃

2
VARIE 10
PLOTS 3
7 7 14 11 9 6
7 8 12 9 8 9
13 10 9 7 13 13
11 11 7 6 16 11
12 5 12 11 11 11
The user tests the significance of the difference between varieties using ANVAR.

- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? AVANC ⊃
PROGRAM NAME (TYPE "HELP" FOR AID)? ANVAR ⊃
ENTER NAME OF DATA FILE.? PLOTS ⊃

GRAND MEAN 9.866667

    SOURCE OF         SUMS OF      DEGREES OF    MEAN
    VARIATION         SQUARES      FREEDOM       SQUARES

 1  VARIE             156.1333         9         17.34815
 2  PLOTS             11.46667         2         5.733333
 3  VARIE PLOTS       43.86667        18         2.437037
    TOTAL             211.4667        29

HOW MANY SOURCES YOU WANT TO POOL UNDER ERROR.? 2 ⊃

ENTER THE INDICES OF THE 2 SOURCES, NOT EXCEEDING 8 IN A ROW.
2 3 ⊃

SOURCE OF     SUMS OF      DEGREES OF    F
VARIATION     SQUARES      FREEDOM       RATIO

VARIE         156.1333         9         6.270415
ERROR         55.33333        20
TOTAL         211.4667        29

MODULE NAME (TYPE "HELP" FOR AID)?
In this problem, VARIE is the only factor of interest. PLOTS is treated as a dummy second
factor. In the analysis of variance table, the factor PLOTS and interaction factor VARIE PLOTS 
are pooled for the purpose of significance tests. 

The F-ratio corresponding to VARIE is 6.2704. The user finds from F-test tables that this 
value of F for 9 and 20 degrees of freedom is highly significant at the 5% level. 
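
The pooled F-ratio can be reproduced from the contents of file PLOTS. The sketch below is not the ANVAR code; it assumes the usual sums-of-squares arithmetic, and it treats everything other than VARIE (that is, PLOTS and the VARIE PLOTS interaction) as the pooled ERROR source. All variable names are illustrative.

```python
# yields[plot][variety], in the order stored on file PLOTS.
plots = [
    [7, 7, 14, 11, 9, 6, 7, 8, 12, 9],
    [8, 9, 13, 10, 9, 7, 13, 13, 11, 11],
    [7, 6, 16, 11, 12, 5, 12, 11, 11, 11],
]
n_plots, n_varieties = len(plots), len(plots[0])
n = n_plots * n_varieties

grand_total = sum(map(sum, plots))
grand_mean = grand_total / n
correction = grand_total ** 2 / n

ss_total = sum(x * x for row in plots for x in row) - correction
# zip(*plots) yields one tuple of three plot yields per variety.
ss_varie = sum(sum(col) ** 2 for col in zip(*plots)) / n_plots - correction
ss_error = ss_total - ss_varie               # PLOTS and VARIE PLOTS pooled

df_varie = n_varieties - 1
df_error = (n - 1) - df_varie
f_ratio = (ss_varie / df_varie) / (ss_error / df_error)
print(round(grand_mean, 6), round(ss_varie, 4), round(ss_error, 5), df_error,
      round(f_ratio, 6))
```

The sketch reproduces the grand mean, the sums of squares for VARIE and ERROR, the 20 error degrees of freedom, and the F-ratio printed by ANVAR.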

Example 2 

The data below represents yields of wheat in bushels. Four different varieties of wheat were 
tried in combination with three different fertilizers and three different pesticides. The experi- 
ment was repeated twice. Thus, the problem calls for a four-factor analysis, where the factors 
are fertilizer, variety, pesticide, and trial. 



                       F1                    F2                    F3
                V1   V2   V3   V4     V1   V2   V3   V4     V1   V2   V3   V4

Trial 1   P1     3   10    9    8     24    8    9    3      2    8    9    8
          P2     4   12    3    9     22    7   16    2      2    2    7    2
          P3     5   10    5    8     23    9   17    3      2    8    6    3

Trial 2   P1     2   14    9   13     29   16   11    3      2    7    5    3
          P2     7   11    5    8     28   18   10    6      6    6    5    9
          P3     9   10   27    8     28   16   11    7      8    9    8   15

where F represents fertilizer.
      V represents variety.
      P represents pesticide.

The data is stored on file WHEAT.

- TYPE WHEAT ⊃

4
VARIE 4        (variety)
FERTI 3        (fertilizer)
PESTI 3        (pesticide)
TRIAL 2        (trial)
3 10 9 8 24 8
9 3 2 8 9 8
4 12 3 9 22 7
16 2 2 2 7 2
5 10 5 8 23 9
17 3 2 8 6 3
2 14 9 13 29 16
11 3 2 7 5 3
7 11 5 8 28 18
10 6 6 6 5 9
9 10 27 8 28 16
11 7 8 9 8 15






The user tests for the significance of individual factors and the interaction of factors using 
ANVAR. 



- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? AVANC ⊃
PROGRAM NAME (TYPE "HELP" FOR AID)? ANVAR ⊃
ENTER NAME OF DATA FILE.? WHEAT ⊃

GRAND MEAN 9.402778

    SOURCE OF                     SUMS OF      DEGREES OF    MEAN
    VARIATION                     SQUARES      FREEDOM       SQUARES

 1  VARIE                         229.0417         3         76.34722
 2  FERTI                         722.6944         2         361.3472
 3  VARIE FERTI                   1382.083         6         230.3472
 4  PESTI                         55.11111         2         27.55556
 5  VARIE PESTI                   42.00000         6         7.000000
 6  FERTI PESTI                   13.13889         4         3.284722
 7  VARIE FERTI PESTI             140.7500        12         11.72917
 8  TRIAL                         141.6806         1         141.6806
 9  VARIE TRIAL                   18.81944         3         6.273148
10  FERTI TRIAL                   6.027778         2         3.013889
11  VARIE FERTI TRIAL             176.9722         6         29.49537
12  PESTI TRIAL                   40.77778         2         20.38889
13  VARIE PESTI TRIAL             50.55555         6         8.425926
14  FERTI PESTI TRIAL             62.63889         4         15.65972
15  VARIE FERTI PESTI TRIAL       151.0278        12         12.58565
    TOTAL                         3233.319

HOW MANY SOURCES YOU WANT TO POOL UNDER ERROR.? 7 ⊃

ENTER THE INDICES OF THE 7 SOURCES, NOT EXCEEDING 8 IN A ROW.
9 10 11 12 13 14 15 ⊃

SOURCE OF               SUMS OF      DEGREES OF    F
VARIATION               SQUARES      FREEDOM       RATIO

VARIE                   229.0417         3         5.272396
FERTI                   722.6944         2         24.95396
VARIE FERTI             1382.083         6         15.90735
PESTI                   55.11111         2         1.902935
VARIE PESTI             42.00000         6         0.4834069
FERTI PESTI             13.13889         4         0.2268368
VARIE FERTI PESTI       140.7500        12         0.8099942
TRIAL                   141.6806         1         9.784193
ERROR                   506.8194        35
TOTAL                   3233.319

MODULE NAME (TYPE "HELP" FOR AID)?



The FCTOR Program 



This program performs a factor analysis of a group of variables. FCTOR determines the 
minimum number of factors required to explain most of the variation in the variables, and 
computes their values. 

FCTOR accepts a maximum of 40 variables and 3000 observations per variable. 

FCTOR computes and prints the following statistics: 

• Means and standard deviations for each variable. 

• Correlation matrix (optional). 

• Eigenvalues and eigenvectors. 

• Initial factor matrix (optional). 

• Rotated factor matrix. 

• Check on communalities. 
Example

The user performs a factor analysis on the data below with six variables and 23 observations 
per variable. The data is stored on file FACTR. 



- TYPE FACTR ⊃

23 6
VAR1
7 13 9 7 6 10
7 16 9 8 8 9
11 9 10 11 16 9
7 8 6 10 8
VAR2
7 18 18 13 8 12
6 19 22 15 10 12
17 16 15 11 9 8
18 11 6 9 10
VAR3
9 25 24 25 20 30
11 25 26 26 20 28
21 26 24 30 16 19
22 23 28 26 26
VAR4
7 15 23 36 7 11
7 16 24 30 8 11
30 27 18 19 20 14
9 18 23 26 15
VAR5
15 13 12 11 15 10
15 13 13 13 17 8
10 14 12 19 18 16
15 9 7 10 11
VAR6
36 35 43 12 46 42
35 30 40 10 40 45
45 31 29 26 31 33
37 36 40 37 42
- R STATPAK ⊃

TYMSHARE PDP-10 STATPAK VERSION 3.01 (11/9/71)

MODULE NAME (TYPE "HELP" FOR AID)? AVANC ⊃

PROGRAM NAME (TYPE "HELP" FOR AID)? FCTOR ⊃

NAME OF DATA FILE? FACTR ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR1 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR2 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR3 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR4 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR5 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? VAR6 ⊃

ENTER NAME OF VARIABLE IN ANALYSIS? STOP ⊃        The user types STOP to stop entering values.

ENTER VALUE OF CONSTANT TO LIMIT EIGENVALUES.? 1 ⊃        The user allows no eigenvalues less than 1.

MEANS

VAR1     9.304348
VAR2     12.60870
VAR3     23.04348
VAR4     18.00000
VAR5     12.86957
VAR6     34.82609

STANDARD DEVIATIONS

VAR1     2.704118
VAR2     4.599794
VAR3     5.372305
VAR4     8.333939
VAR5     3.137810
VAR6     9.291502

DO YOU WANT THE CORRELATION MATRIX PRINTED.? YES ⊃

CORRELATION COEFFICIENTS
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ROW VAR1 

1.000000 0.3498 669 



0.1085591 0.1210187 0.2191729 



-0.95489 72 E-01 

ROW VAR2 

0.3498669 
-01 
-0.9100186E-01 



I. 000000 0.3980321 0.3557213 -0.8242919E 



ROW VAR3 

0.1085591 0.3980321 



1.000000 0.4172613 -0.4445602 



-0.781 5379 E-01 



ROW VAR4 

0.1210187 0.3557213 



0.4172613 1.000000 -0.3128765 



-0.5036494 



ROW VAR5 

0.2191729 -0.8242919E-01 -0.4445602 



-0.3128765 1.000000 



-0.22999 63 

ROW VAR6 
-0.9548972E-01 -0 .9 100186E-01 -0. 781 5379 E-01 -0.5036494 

1 .000000 



-0.2299963 



EIGENVALUES 
2.143067 



1.459 717 1.124474 



CUMULATIVE PERCENTAGE OF EIGENVALUES 

0.3571779 0.6004640 0.7878763 



EIGENVECTORS 

VECTOR 1 
0.2174841 

-0.2776769 

VECTOR 2 
0.4772406 

-0.4772047 

VECTOR 3 
0.5471254 

-01 

0.6066640 



0.4629068 0.5167775 0.5572654 -0.2893265 



0.1444153 -0.2745525 0. 5696335E-01 0.6671112 



0.4276067 0.1244227 -0.3650158 0.3236728E 
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DO YOU WANT THE INITIAL FACTOR MATRIX PRINTED. ? YES? 



FACTOR MATRIX ( 3 FACTORS)

VARIABLE VAR1    0.3183797       0.5765961       0.5801783

VARIABLE VAR2    0.6776595       0.1744808       0.4534393

VARIABLE VAR3    0.7565219      -0.3317109       0.1319393

VARIABLE VAR4    0.8157930       0.6882240E-01  -0.3870672

VARIABLE VAR5   -0.4235514       0.8059953       0.3432265E-01

VARIABLE VAR6   -0.4064972      -0.5765527       0.6433138
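
The printed factor matrix agrees with the usual principal-component relation in which each loading is the eigenvector component multiplied by the square root of its eigenvalue. The manual does not state this formula at this point, so the Python sketch below is an assumption checked only against the sample output. The variable names are illustrative.

```python
import math

# First eigenvalue and eigenvector from the sample run.
eigenvalue_1 = 2.143067
vector_1 = [0.2174841, 0.4629068, 0.5167775, 0.5572654, -0.2893265, -0.2776769]

# Assumed relation: loading = eigenvector component * sqrt(eigenvalue).
scale = math.sqrt(eigenvalue_1)
loadings_1 = [component * scale for component in vector_1]

# First column of the printed factor matrix, for comparison.
printed_column_1 = [0.3183797, 0.6776595, 0.7565219, 0.8157930, -0.4235514, -0.4064972]

for name, computed, printed in zip(range(1, 7), loadings_1, printed_column_1):
    print(f"VAR{name}  computed {computed: .7f}  printed {printed: .7f}")
```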



ITERATION       VARIANCES
  CYCLE

    0           0.1928336
    1           0.3248272
    2           0.4338407
    3           0.4342646
    4           0.4342651
    5           0.4342651
    6           0.4342651
    7           0.4342651
    8           0.4342651



NEED ROTATED FACTOR MATRIX?

TYPE TTY, DSK, BOTH OR NONE.

BOTH     The rotated matrix is printed on the terminal and saved on a file.

OUTPUT FILENAME.? ROTAT

The file ROTAT.DAT contains the rotated matrix.

ROTATED FACTOR MATRIX ( 3 FACTORS)

VARIABLE VAR1    0.2992361E-01   0.1922221       0.8559149

VARIABLE VAR2    0.1305635      -0.3419365       0.7492030

VARIABLE VAR3    0.1675858      -0.7579546       0.3117448

VARIABLE VAR4    0.7478828      -0.4816886       0.1694775

VARIABLE VAR5    0.1142076       0.8799314       0.2070729

VARIABLE VAR6   -0.9373980      -0.1784606      -0.3068892E-01

Three distinct factors explain most of the
variability in this group of six variables.



CHECK ON COMMUNALITIES

VARIABLE        ORIGINAL        FINAL           DIFFERENCE

VAR1            0.7704356       0.7704350       0.5885959E-06

VAR2            0.6952731       0.6952725       0.5662441E-06

VAR3            0.6997654       0.6997649       0.4693866E-06

VAR4            0.8200758       0.8200751       0.6556511E-06

VAR5            0.8302023       0.8302018       0.4842877E-06

VAR6            0.9115057       0.9115050       0.7078052E-06
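
A communality is the sum of a variable's squared loadings over the retained factors. An orthogonal rotation leaves it unchanged, which is why the differences reported above are close to zero. The Python sketch below recomputes both columns from the matrices in the sample run. The names `initial`, `rotated`, and `communality` are illustrative only.

```python
# Loadings copied from the initial and rotated factor matrices of the sample run.
initial = {
    "VAR1": [0.3183797, 0.5765961, 0.5801783],
    "VAR2": [0.6776595, 0.1744808, 0.4534393],
    "VAR3": [0.7565219, -0.3317109, 0.1319393],
    "VAR4": [0.8157930, 0.06882240, -0.3870672],
    "VAR5": [-0.4235514, 0.8059953, 0.03432265],
    "VAR6": [-0.4064972, -0.5765527, 0.6433138],
}
rotated = {
    "VAR1": [0.02992361, 0.1922221, 0.8559149],
    "VAR2": [0.1305635, -0.3419365, 0.7492030],
    "VAR3": [0.1675858, -0.7579546, 0.3117448],
    "VAR4": [0.7478828, -0.4816886, 0.1694775],
    "VAR5": [0.1142076, 0.8799314, 0.2070729],
    "VAR6": [-0.9373980, -0.1784606, -0.03068892],
}


def communality(loadings):
    """Sum of squared loadings for one variable."""
    return sum(value * value for value in loadings)


for name in initial:
    original = communality(initial[name])
    final = communality(rotated[name])
    print(f"{name}  {original:.7f}  {final:.7f}  {original - final:.3e}")
```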



MODULE NAME (TYPE "HELP" FOR AID)? STOP



EXECUTION TIME: 3 MIN. 52.85 SEC.

TOTAL ELAPSED TIME: 7 MIN. 31.58 SEC.

NO EXECUTION ERRORS DETECTED



EXIT



-TYPE ROTAT.DAT



The retained data is printed. 



ROTATED FACTOR MATRIX
VAR1    0.2992361E-01   0.1922221       0.8559149
VAR2    0.1305635      -0.3419365       0.7492030
VAR3    0.1675858      -0.7579546       0.3117448
VAR4    0.7478828      -0.4816886       0.1694775
VAR5    0.1142076       0.8799314       0.2070729
VAR6   -0.9373980      -0.1784606      -0.3068892E-01



APPENDIX
ERROR MESSAGES 

Program Message and Description 

General NO SUCH VARIABLE. TRY AGAIN. 

The user tries to retrieve data for a nonexistent variable. 

VARIABLE ALREADY SPECIFIED. TRY AGAIN. 

The user tries to enter a variable name twice in the same analysis. 

DSCRE ILLEGAL CONDITION CODE. TRY AGAIN. 

The user attempts to use a condition code other than GT, GE, LT, LE, NE, 
and EQ. 

NO. OF FREQ. CLASSES EXCEEDS 12. TRY AGAIN. 

The user requests more than 12 frequency classes. 

DREGR VARIABLE ALREADY IN REGRESSION. TRY AGAIN. 

The user tries to enter a variable into the same regression analysis twice. 

THE MATRIX IS SINGULAR. THIS SELECTION IS SKIPPED. 

The matrix of cross products becomes singular. 

TOO FEW OBSERVATIONS. REGRESSION TERMINATED. 

The number of observations per variable is less than the number of variables 
plus 2. 

RGSTP CODE IS OTHER THAN 0 OR 1 OR 2. TRY AGAIN. 

The user specifies an illegal code for an independent variable. 

TOO FEW OBSERVATIONS. ANALYSIS TERMINATED. 

The number of observations per variable is less than the number of variables 
plus 3. 

EITHER THE MATRIX IS SINGULAR OR SUM OF SQUARES NEGATIVE. 
PROBLEM IGNORED. 

The problem contains illegal conditions. 



RGPOL DEGREE EXCEEDS 10. TRY AGAIN. 

The user enters a degree higher than ten. 

NO. OF OBS. IS NOT GREATER THAN DEGREE +1. 

The number of observations is less than the polynomial's degree plus 2. 



TSTAT ILLEGAL CODE. TRY AGAIN. 

The user enters a hypothesis code other than 1, 2, 3, or 4. 

FOR HYPOTHESIS WITH CODE 4, THE NUMBER OF OBSERVATIONS 
FOR THE TWO VARIABLES MUST BE EQUAL. EXECUTION ABORTED. 

The user must start over. 



TSTAT, FSTAT MORE THAN 100 OBSERVATIONS PER VARIABLE. EXECUTION ABORTED. 

The user tries to enter more than 100 observations for a variable. 



UTEST, CRANK YOU HAVE OVERLOOKED ONE OF THE FOLLOWING: SAMPLE SIZE 1 
EXCEEDS SAMPLE SIZE 2, SAMPLE SIZE EXCEEDS 100. EXECUTION 
ABORTED. 

The user makes an error in the sample size. 



CRANK, CCORD ILLEGAL CODE. TRY AGAIN. 

The user enters a rank code other than 0 or 1. 



CCORD THE NUMBER OF OBSERVATIONS IS DIFFERENT FROM THE NUMBER 
IN DATA FILE. EXECUTION ABORTED. 

The user specifies a number of observations different from the number on 
the data file. 

YOU HAVE OVERLOOKED ONE OF THE FOLLOWING: NUMBER OF 
VARIABLES OR SOURCES EXCEEDS 20, NUMBER OF OBSERVATIONS 
PER VARIABLE EXCEEDS 100. EXECUTION ABORTED. 

The user enters too many variables or observations. 

TRIXP SMOOTHING COEFFICIENT MUST HAVE A VALUE BETWEEN 0 AND 1. 
TRY AGAIN. 

The user enters an illegal value for the smoothing coefficient. 



XPOSE MORE THAN 100 PERIODS IN HISTORY. EXECUTION ABORTED. 

The user enters more than 100 observations. 

H CANNOT EXCEED 60 AND L CANNOT EXCEED 12. RETYPE 
VALUES OF H AND L. 

The user specifies too many periods for generating the forecasting parameters, 
or designates more than 12 seasons in a year. 

YOU HAVE OVERLOOKED ONE OF THE FOLLOWING: NUMBER OF 
PERIODS IN HISTORY MUST BE GREATER THAN THE NUMBER OF 
PERIODS USED FOR GENERATING INITIAL VALUES, NUMBER OF 
PERIODS USED FOR GENERATING INITIAL VALUES MUST BE 
GREATER THAN AND A MULTIPLE OF NUMBER OF SEASONS. 
RETYPE VALUES OF H AND L. 

The restrictions on H and L are described on page 56. 
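
The two XPOSE messages above amount to a small set of checks on H (the number of periods used for generating initial values) and L (the number of seasons). The Python sketch below restates them as read from the message text alone. The function name and arguments are hypothetical, and the guard against a non-positive L is an added assumption to avoid dividing by zero.

```python
def check_h_and_l(history_periods, h, l):
    """Return a list of violated restrictions; an empty list means H and L pass."""
    problems = []
    if h > 60:
        problems.append("H cannot exceed 60")
    if l > 12:
        problems.append("L cannot exceed 12")
    if history_periods <= h:
        problems.append("periods in history must be greater than H")
    # Assumption: L must be positive; the manual does not say so explicitly.
    if l <= 0 or h <= l or h % l != 0:
        problems.append("H must be greater than and a multiple of L")
    return problems


# Example: 48 periods of history, 24 periods for initial values, 12 seasons.
print(check_h_and_l(48, 24, 12))
# Example: H is not a multiple of L.
print(check_h_and_l(48, 24, 5))
```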



DANSS YOU HAVE OVERLOOKED ONE OF THE FOLLOWING: 

NO. OF GROUPS EXCEEDS 10 
NO. OF VARIABLES EXCEEDS 10 
TOTAL NO. OF OBSERVATIONS EXCEEDS 3000 
EXECUTION ABORTED 

The user enters too many variables or groups. 



ANVAR YOU HAVE OVERLOOKED ONE OF THE FOLLOWING: 

NUMBER OF FACTORS EXCEEDS 6 

NUMBER OF LEVELS FOR A FACTOR IS LESS THAN 2 
TOTAL NUMBER OF OBSERVATIONS EXCEEDS 4000 
EXECUTION ABORTED. 

The user overlooks one of the restrictions on input data. 
NOTE: The total number of observations is the product of the levels of 
all factors. 
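
Because the total number of observations is the product of the levels of all factors, a design can be checked against the ANVAR limits before any data is entered. The Python sketch below assumes only the three restrictions listed in the message. The function name is hypothetical.

```python
import math


def check_anvar_design(levels):
    """levels holds the number of levels for each factor; returns violated limits."""
    problems = []
    if len(levels) > 6:
        problems.append("number of factors exceeds 6")
    if any(count < 2 for count in levels):
        problems.append("number of levels for a factor is less than 2")
    # Total observations = product of the levels of all factors.
    if math.prod(levels) > 4000:
        problems.append("total number of observations exceeds 4000")
    return problems


# Example: three factors with 2, 3, and 4 levels give 24 observations.
print(check_anvar_design([2, 3, 4]))
# Example: 10 * 10 * 10 * 5 = 5000 observations, which is over the limit.
print(check_anvar_design([10, 10, 10, 5]))
```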

FCTOR MORE THAN 40 VARIABLES OR MORE THAN 3000 OBSERVATIONS. 
EXECUTION TERMINATED. 

The restrictions on variables and/or observations are exceeded. 



THERE IS NO EIGENVALUE GREATER THAN THE CONSTANT 
SPECIFIED. EXECUTION ABORTED. 

The user's constant to limit eigenvalues eliminates all the eigenvalues. 



